Meta-ecosystems have been studied looking at meta-ecosystems in which patch size was the same. However, of course, we know that meta-ecosystems are mad out of patches that have different size. To see the effects of patch size on meta-ecosystem properties, we ran a four weeks protist experiment in which different ecosystems were connected through the flow of nutrients. The flow of nutrients resulted from a perturbation of the ecosystems in which a fixed part of the cultures was boiled and then poored into the receiving patch. This had a fixed volume (e.g., small perturbation = 6.75 ml) and was the same across all patch sizes. The experiment design consisted in crossing two disturbances with a small, medium, and large isolated ecosystems and with a small-small, medium-medium, large-large, and small-large meta-ecosystem. We took videos every four days and we create this perturbation and resource flow the day after taking videos. We skipped the perturbation the day after we assembled the experiment so that we would start perturbing it when population densities were already high.
We had mainly two research questions:
Do local properties of a patch depend upon the size of the patch it is connected to?
Do regional properties of a meta-ecosystem depend upon the relative size of its patch?
23/3/22 PPM for increasing the number of monocultures in the collection.
24/3/22 Collection control. See monoculture maintenance lab book p. 47.
26/3/22 Increase of number of monocultures in the collection. To do so, take the best culture and make 3 new ones. See monoculture maintenance lab book p. 47.
1/4/22 Make PPM for high density monocultures. See PatchSizePilot lab book p. 5.
3/4/22 Make bacterial solution for high density monocultures. See PatchSizePilot lab book p. 8.
5/4/22 Grow high density monocultures. Make 3 high density monocultures for each protist species with 200 ml with 5% bacterial solution, 85% PPM, 10% protists, and 2 seeds. See PatchSizePilot lab book p. 10
10/4/2022 Check high density monocultures. Cep, Eup, Spi, Spi te were really low.
13/4/2022 Start of the experiment. See PatchSizePilot p. 33.
- Autoclave all the material in advance
- Get more high-density monocultures
- Decide in advance the days in which you are going to check the high-density monocultures and prepare bacteria in advance for that day so that if some of them crashed you are still on time to make new ones.
- Use a single lab book for also when you create PPM and check the collection.
- Make a really high amount of PPM, as you will need for so many different things (>10 L). Maybe also autoclave 1 L Schott bottles so that you don’t have to oxygenate whole 5 L bottles of PPM. I think that I should have maybe made even a 10 L bottle of PPM.
- According to Silvana protists take 4-7 days to grow. The fastest is Tet (ca 4 days) and the slowest is Spi (ca 7 days). Once that you grow them they should stay at carrying capacity for a bit of time I guess, as you can see in the monoculture collection. I should make sure I’m growing them in the right way. I think that maybe I should grow them 10 days in advance so that I could actually grow also the slow species if they crashed. What should I do if all of them crashed?
To build the mixed effect models we will use the R package lme4. See page 6 of this PDF to know more about the syntaxis of this package and this link for the interaction syntaxis.
To do model diagnostics of mixed effect models, I’m going to look at the following two plots (as suggested by Zuur et al. (2009), page 487):
Quantile-quantile plots (plot(mixed_model))
Partial residual plots
(qqnorm(resid(mixed_model)))
The effect size of the explaining variables is calculated in the
mixed effect models as marginal and conditional r squared. The marginal
r squared is how much variance is explained by the fixed effects. The
conditional r squared is how much variance is explained by the fixed and
the random effects. The marginal and conditional r squared are
calculated using the package MuMIn. The computation is
based on the methods of Nakagawa, Johnson, and
Schielzeth (2017). For the coding and interpretation of these r
squared check the documentation
for the r.squaredGLMM function
Time can be included as a fixed or random effect. Time can be included as a random effect if the different data points are non independent from each other (e.g., seasons). However, because the biomass in our experiment was following a temporal trend, the different time points show autocorrelation. In other words, t2 is more similar to t3 than t4 and so on. This is why we decided to include time as a fixed effect. For an excellent discussion on this topic see this blog post.
I am going to select the best model according to AIC. Halsey (2019) suggests this approach instead of p values. P-values are not a reliable way of choosing a model because:
My sample size is small, producing larger p-values
P-values are really variable, creating many false positives and negatives (e.g., if p=0.05 there is a 1 in 3 chance that it’s a false positive)
To study the local biomass how it changes across treatments, we
could have made three different models between the three combinations of
small patches. However, that might be confusing to interpret the
results. We decided instead to use an effect size where we control is
the isolated small patch. At the beginning we thought to use the natural
logarithm of the response ratio (lnRR). The problem, however, is that
some bioarea values were 0. We were thinking to add 1 to all null
values, but according to Rosenberg, Rothstein,
and Gurevitch (2013), such practice inflates effect sizes.
Because of this, I looked into other types of effect size. I found that
the most common and preferred metric in use today is known as Hedge’s d
(a.k.a. Hedge’s g) (Hedges, Larry V. and Olkin
(1985) ). It is calculated as the difference in mean between
treatment and control divided by the standard deviation of the pooled
data. Another measure would be Cohen’s d, but it underperforms with
sample sizes that are lower than 20 (StatisticsHowTo). I
can easily calculate the Hedge’s d using the r package
effsize.
Same thing for the large patches.
Single time points cannot be computed for effect sizes just because I have a single data point as response variable (e.g., one for S (S_S) at low disturbance at time point = 2). It would have been nice to show the difference in treatments at specific time points visually through confidence intervals. However, the confidence interval cannot be computed for the last two time points because there are too many zeros and their confidence interval computation produces a lot of Nan, Inf, and -Inf. At this point it doesn’t make sense to show just a subset of time points and we decided to opt out from the visualisation of confidence intervals.
culture_info)This table contains information about the 110 cultures of the experiment.
culture_info = read.csv(here("data", "PatchSizePilot_culture_info.csv"), header = TRUE)
datatable(culture_info[,1:10],
rownames = FALSE,
options = list(scrollX = TRUE),
filter = list(position = 'top',
clear = FALSE))
ds_biomass_abund)This dataset is the master dataset containing all the information about the biomass of the experiment.
load(here("data", "population", "t0.RData")); t0 = pop_output
load(here("data", "population", "t1.RData")); t1 = pop_output
load(here("data", "population", "t2.RData")); t2 = pop_output
load(here("data", "population", "t3.RData")); t3 = pop_output
load(here("data", "population", "t4.RData")); t4 = pop_output
load(here("data", "population", "t5.RData")); t5 = pop_output
load(here("data", "population", "t6.RData")); t6 = pop_output
load(here("data", "population", "t7.RData")); t7 = pop_output
rm(pop_output)
#Column: time
t0$time = NA
t1$time = NA
#Column: replicate_video
t0$replicate_video = 1:12 #In t1 I took 12 videos of a single
t1$replicate_video = 1 #In t1 I took only 1 video/culture
t2$replicate_video = 1 #In t2 I took only 1 video/culture
t3$replicate_video = 1 #In t3 I took only 1 video/culture
t4$replicate_video = 1 #In t4 I took only 1 video/culture
t5$replicate_video = 1 #In t5 I took only 1 video/culture
t6 = t6 %>%
rename(replicate_video = replicate)
t7 = t7 %>%
rename(replicate_video = replicate)
#Elongate t0 (so that it can be merged wiht culture_info)
number_of_columns_t0 = ncol(t0)
nr_of_cultures = nrow(culture_info)
nr_of_videos = nrow(t0)
t0 = t0[rep(row.names(t0), nr_of_cultures),] %>%
arrange(file) %>%
mutate(culture_ID = rep(1:nr_of_cultures, times = nr_of_videos))
#Merge time points
t0 = merge(culture_info,t0, by="culture_ID")
t1 = merge(culture_info,t1, by = "culture_ID")
t2 = merge(culture_info,t2, by = "culture_ID")
t3 = merge(culture_info,t3, by = "culture_ID")
t4 = merge(culture_info,t4, by = "culture_ID")
t5 = merge(culture_info,t5, by = "culture_ID")
t6 = merge(culture_info,t6, by = "culture_ID")
t7 = merge(culture_info,t7, by = "culture_ID")
ds_biomass_abund = rbind(t0, t1, t2, t3, t4, t5, t6, t7)
rm(t0, t1, t2, t3, t4, t5, t6, t7)
#Take off spilled cultures
ds_biomass_abund = ds_biomass_abund %>%
filter(! culture_ID %in% ecosystems_to_take_off)
#Column: time_point
ds_biomass_abund$time_point[ds_biomass_abund$time_point=="t0"] = 0
ds_biomass_abund$time_point[ds_biomass_abund$time_point=="t1"] = 1
ds_biomass_abund$time_point[ds_biomass_abund$time_point=="t2"] = 2
ds_biomass_abund$time_point[ds_biomass_abund$time_point=="t3"] = 3
ds_biomass_abund$time_point[ds_biomass_abund$time_point=="t4"] = 4
ds_biomass_abund$time_point[ds_biomass_abund$time_point=="t5"] = 5
ds_biomass_abund$time_point[ds_biomass_abund$time_point=="t6"] = 6
ds_biomass_abund$time_point[ds_biomass_abund$time_point=="t7"] = 7
ds_biomass_abund$time_point = as.character(ds_biomass_abund$time_point)
#Column: day
ds_biomass_abund$day = NA
ds_biomass_abund$day[ds_biomass_abund$time_point== 0] = 0
ds_biomass_abund$day[ds_biomass_abund$time_point== 1] = 4
ds_biomass_abund$day[ds_biomass_abund$time_point== 2] = 8
ds_biomass_abund$day[ds_biomass_abund$time_point== 3] = 12
ds_biomass_abund$day[ds_biomass_abund$time_point== 4] = 16
ds_biomass_abund$day[ds_biomass_abund$time_point== 5] = 20
ds_biomass_abund$day[ds_biomass_abund$time_point== 6] = 24
ds_biomass_abund$day[ds_biomass_abund$time_point== 7] = 28
#Column: size_of_connected_patch
ds_biomass_abund$size_of_connected_patch[ds_biomass_abund$eco_metaeco_type == "S"] = "S"
ds_biomass_abund$size_of_connected_patch[ds_biomass_abund$eco_metaeco_type == "S (S_S)"] = "S"
ds_biomass_abund$size_of_connected_patch[ds_biomass_abund$eco_metaeco_type == "S (S_L)"] = "L"
ds_biomass_abund$size_of_connected_patch[ds_biomass_abund$eco_metaeco_type == "M (M_M)"] = "M"
ds_biomass_abund$size_of_connected_patch[ds_biomass_abund$eco_metaeco_type == "L"] = "L"
ds_biomass_abund$size_of_connected_patch[ds_biomass_abund$eco_metaeco_type == "L (L_L)"] = "L"
ds_biomass_abund$size_of_connected_patch[ds_biomass_abund$eco_metaeco_type == "L (S_L)"] = "S"
#Column: bioarea_tot & biomass_tot
ds_biomass_abund = ds_biomass_abund %>%
mutate(bioarea_tot = indiv_per_volume * patch_size_volume * 1000) %>% #Bioarea per volume is in micromitre, patch_size volume is in ml
mutate(indiv_tot = indiv_per_volume * patch_size_volume * 1000)
#Keep this dataset for the evaporation effects
ds_for_evaporation = ds_biomass_abund
ds_biomass_abund = ds_biomass_abund %>%
select(culture_ID,
patch_size,
patch_size_volume,
disturbance,
metaecosystem_type,
indiv_per_volume,
replicate_video,
time_point,
day,
metaecosystem,
system_nr,
eco_metaeco_type,
size_of_connected_patch,
bioarea_per_volume,
bioarea_tot,
indiv_per_volume,
indiv_tot) %>%
relocate(culture_ID,
system_nr,
disturbance,
time_point,
day,
patch_size,
patch_size_volume,
metaecosystem,
metaecosystem_type,
eco_metaeco_type,
size_of_connected_patch,
replicate_video,
bioarea_per_volume,
bioarea_tot,
indiv_per_volume,
indiv_tot)
datatable(ds_biomass_abund,
rownames = FALSE,
options = list(scrollX = TRUE),
filter = list(position = 'top',
clear = FALSE))
ds_effect_size_bioarea_density)eco_metaeco_types = unique(ds_biomass_abund$eco_metaeco_type)
small_patches = c("S", "S (S_S)", "S (S_L)")
medium_patches = c("M", "M (M_M)")
large_patches = c("L", "L (L_L)", "L (S_L)")
averaged_bioarea_density = ds_biomass_abund %>%
group_by(disturbance, eco_metaeco_type, time_point, culture_ID, patch_size, day) %>%
summarise(bioarea_per_volume_across_videos = mean(bioarea_per_volume)) %>%
group_by(disturbance, eco_metaeco_type, time_point, patch_size, day) %>%
summarise(mean_bioarea_density = mean(bioarea_per_volume_across_videos),
sd_bioarea_density = sd(bioarea_per_volume_across_videos),
sam_size_bioarea_density = n()) %>%
mutate(mean_bioarea_density_isolated = NA,
sd_bioarea_density_isolated = NA,
sam_size_bioarea_density_isolated = NA)
for (disturbance_input in c("low", "high")){
for (eco_metaeco_input in eco_metaeco_types){
for (time_point_input in 0:7){
if (eco_metaeco_input %in% small_patches){isolated_patch_type = "S"}
if (eco_metaeco_input %in% medium_patches){isolated_patch_type = "M"}
if (eco_metaeco_input %in% large_patches){isolated_patch_type = "L"}
control_row = averaged_bioarea_density %>%
filter(eco_metaeco_type == isolated_patch_type) %>%
filter(disturbance == disturbance_input) %>%
filter(time_point == time_point_input)
averaged_bioarea_density$mean_bioarea_density_isolated[
averaged_bioarea_density$eco_metaeco_type == eco_metaeco_input &
averaged_bioarea_density$disturbance == disturbance_input &
averaged_bioarea_density$time_point == time_point_input] =
control_row$mean_bioarea_density
averaged_bioarea_density$sd_bioarea_density_isolated[
averaged_bioarea_density$eco_metaeco_type == eco_metaeco_input &
averaged_bioarea_density$disturbance == disturbance_input &
averaged_bioarea_density$time_point == time_point_input] =
control_row$sd_bioarea_density
averaged_bioarea_density$sam_size_bioarea_density_isolated[
averaged_bioarea_density$eco_metaeco_type == eco_metaeco_input &
averaged_bioarea_density$disturbance == disturbance_input &
averaged_bioarea_density$time_point == time_point_input] =
control_row$sam_size_bioarea_density
}}}
ds_effect_size_bioarea_density = averaged_bioarea_density %>%
mutate(bioarea_density_hedges_d = calculate.hedges_d(mean_bioarea_density,
sd_bioarea_density,
sam_size_bioarea_density,
mean_bioarea_density_isolated,
sd_bioarea_density_isolated,
sam_size_bioarea_density_isolated)) %>%
mutate(bioarea_density_lnRR = ln(mean_bioarea_density / mean_bioarea_density_isolated))
datatable(ds_effect_size_bioarea_density,
rownames = FALSE,
options = list(scrollX = TRUE),
filter = list(position = 'top',
clear = FALSE))
ds_regional_biomass)This is the dataset of the regional biomass of different
meta-ecosystems. It contains also the regional biomass of the
combination of a small isolated and a large isolated patch
(S_L_from_isolated).
ds_regional_biomass = ds_biomass_abund %>%
filter(metaecosystem == "yes") %>%
filter(! system_nr %in% metaecosystems_to_take_off) %>%
group_by(culture_ID,
system_nr,
disturbance,
time_point,
day,
patch_size,
patch_size_volume,
metaecosystem_type) %>%
summarise(bioarea_per_volume_video_averaged = mean(bioarea_per_volume)) %>%
mutate(total_patch_bioarea = bioarea_per_volume_video_averaged * patch_size_volume) %>%
group_by(system_nr,
disturbance,
time_point,
day,
metaecosystem_type) %>%
summarise(total_regional_bioarea = sum(total_patch_bioarea))
isolated_S_and_L = ds_biomass_abund %>%
filter(eco_metaeco_type == "S" | eco_metaeco_type == "L") %>%
group_by(system_nr, disturbance, time_point, day, eco_metaeco_type) %>%
summarise(bioarea_per_volume_across_videos = mean(bioarea_per_volume))
isolated_S_low = isolated_S_and_L %>%
filter(eco_metaeco_type == "S") %>%
filter(disturbance == "low")
isolated_L_low = isolated_S_and_L %>%
filter(eco_metaeco_type == "L") %>%
filter(disturbance == "low")
isolated_S_high = isolated_S_and_L %>%
filter(eco_metaeco_type == "S") %>%
filter(disturbance == "high")
isolated_L_high = isolated_S_and_L %>%
filter(eco_metaeco_type == "L") %>%
filter(disturbance == "high")
S_low_system_nrs = unique(isolated_S_low$system_nr)
S_high_system_nrs = unique(isolated_S_high$system_nr)
L_low_system_nrs = unique(isolated_L_low$system_nr)
L_high_system_nrs = unique(isolated_L_high$system_nr)
low_system_nrs_combination = expand.grid(S_low_system_nrs, L_low_system_nrs) %>%
mutate(disturbance = "low")
high_system_nrs_combination = expand.grid(S_high_system_nrs, L_high_system_nrs) %>%
mutate(disturbance = "high")
system_nr_combinations = rbind(low_system_nrs_combination, high_system_nrs_combination) %>%
rename(S_system_nr = Var1) %>%
rename(L_system_nr = Var2)
number_of_combinations = nrow(system_nr_combinations)
SL_from_isolated_all_combinations = NULL
for (pair in 1:number_of_combinations){
SL_from_isolated_one_combination =
ds_biomass_abund %>%
filter(system_nr %in% system_nr_combinations[pair,]) %>%
group_by(disturbance, day, time_point, system_nr) %>%
summarise(regional_bioarea_across_videos = mean(bioarea_per_volume)) %>%
group_by(disturbance, day, time_point) %>%
summarise(total_regional_bioarea = sum(regional_bioarea_across_videos)) %>%
mutate(system_nr = 1000 + pair) %>%
mutate(metaecosystem_type = "S_L_from_isolated")
SL_from_isolated_all_combinations[[pair]] = SL_from_isolated_one_combination}
SL_from_isolated_all_combinations_together = NULL
for (combination in 1:number_of_combinations){
SL_from_isolated_all_combinations_together =
rbind(SL_from_isolated_all_combinations_together,
SL_from_isolated_all_combinations[[pair]])}
ds_regional_biomass = rbind(ds_regional_biomass, SL_from_isolated_all_combinations_together)
saveRDS(ds_regional_biomass, file = here("results", "ds_regional_biomass.RData"))
ds_regional_biomass = readRDS(here("results", "ds_regional_biomass.RData"))
datatable(ds_regional_biomass,
rownames = FALSE,
options = list(scrollX = TRUE),
filter = list(position = 'top',
clear = FALSE))
ds_lnRR_community_density)eco_metaeco_types = unique(ds_biomass_abund$eco_metaeco_type)
small_patches = c("S", "S (S_S)", "S (S_L)")
medium_patches = c("M", "M (M_M)")
large_patches = c("L", "L (L_L)", "L (S_L)")
averaged_community_density = ds_biomass_abund %>%
group_by(disturbance, eco_metaeco_type, time_point, culture_ID, patch_size, day) %>%
summarise(indiv_per_volume_across_videos = mean(indiv_per_volume)) %>%
group_by(disturbance, eco_metaeco_type, time_point, patch_size, day) %>%
summarise(mean_community_density = mean(indiv_per_volume_across_videos),
sd_community_density = sd(indiv_per_volume_across_videos),
sam_size_community_density = n())
for (disturbance_input in c("low", "high")){
for (eco_metaeco_input in eco_metaeco_types){
for (time_point_input in 0:7){
if (eco_metaeco_input %in% small_patches){isolated_patch_type = "S"}
if (eco_metaeco_input %in% medium_patches){isolated_patch_type = "M"}
if (eco_metaeco_input %in% large_patches){isolated_patch_type = "L"}
control_row = averaged_community_density %>%
filter(eco_metaeco_type == isolated_patch_type) %>%
filter(disturbance == disturbance_input) %>%
filter(time_point == time_point_input)
averaged_community_density$mean_community_density_isolated[
averaged_community_density$eco_metaeco_type == eco_metaeco_input &
averaged_community_density$disturbance == disturbance_input &
averaged_community_density$time_point == time_point_input] =
control_row$mean_community_density
averaged_community_density$sd_community_density_isolated[
averaged_community_density$eco_metaeco_type == eco_metaeco_input &
averaged_community_density$disturbance == disturbance_input &
averaged_community_density$time_point == time_point_input] =
control_row$sd_community_density
averaged_community_density$sam_size_community_density_isolated[
averaged_community_density$eco_metaeco_type == eco_metaeco_input &
averaged_community_density$disturbance == disturbance_input &
averaged_community_density$time_point == time_point_input] =
control_row$sam_size_community_density
}}}
## Warning: Unknown or uninitialised column: `mean_community_density_isolated`.
## Warning: Unknown or uninitialised column: `sd_community_density_isolated`.
## Warning: Unknown or uninitialised column: `sam_size_community_density_isolated`.
ds_effect_size_community_density = averaged_community_density %>%
mutate(community_density_hedges_d = calculate.hedges_d(mean_community_density,
sd_community_density,
sam_size_community_density,
mean_community_density_isolated,
sd_community_density_isolated,
sam_size_community_density_isolated)) %>%
mutate(community_density_lnRR = ln(mean_community_density/mean_community_density_isolated))
datatable(ds_effect_size_community_density,
rownames = FALSE,
options = list(scrollX = TRUE),
filter = list(position = 'top',
clear = FALSE))
Let’s now import the species ID data. I have no clue if it represents biomass or abundances.
t0_file_name = here("data", "population_species_ID", "species_ID_t0.csv")
t0 = read.csv(t0_file_name, header = TRUE) %>%
mutate(culture_ID = NA, time = NA, replicate = 1:12)
t1_file_name = here("data", "population_species_ID", "species_ID_t1.csv")
t1 = read.csv(t1_file_name, header = TRUE) %>%
mutate(culture_ID = NA, time = NA, replicate = 1)
t2_file_name = here("data", "population_species_ID", "species_ID_t2.csv")
t2 = read.csv(t2_file_name, header = TRUE) %>%
mutate(replicate = 1)
t3_file_name = here("data", "population_species_ID", "species_ID_t3.csv")
t3 = read.csv(t3_file_name, header = TRUE) %>%
mutate(replicate = 1)
t4_file_name = here("data", "population_species_ID", "species_ID_t4.csv")
t4 = read.csv(t4_file_name, header = TRUE) %>%
mutate(replicate = 1)
t5_file_name = here("data", "population_species_ID", "species_ID_t5.csv")
t5 = read.csv(t5_file_name, header = TRUE) %>%
mutate(replicate = 1)
t6_file_name = here("data", "population_species_ID", "species_ID_t6.csv")
t6 = read.csv(t6_file_name, header = TRUE)
t7_file_name = here("data", "population_species_ID", "species_ID_t7.csv")
t7 = read.csv(t7_file_name, header = TRUE)
ds_ID = rbind(t0, t1, t2, t3, t4, t5, t6, t7) %>%
pivot_longer(Ble:Tet,
names_to = "species",
values_to = "abundance")
ds_ID %>%
filter(time_point == "t1") %>%
ggplot(aes(x = species,
y = abundance)) +
geom_boxplot()
In what dataset should I incorporate species ID data? And how?
How do I get the species ID of the single particles? (Write to Lynn)
Look at how to identify Ble by itself.
Look at whether everything is fine in the videos.
Discover what it tells you.
Create dataset.
Discover what it tells you.
Create dataset.
culture_info = read.csv(here("data", "PatchSizePilot_culture_info.csv"), header = TRUE)
load(here("data", "morphology", "t0.RData"));t0 = morph_mvt
load(here("data", "morphology", "t1.RData"));t1 = morph_mvt
load(here("data", "morphology", "t2.RData"));t2 = morph_mvt
load(here("data", "morphology", "t3.RData"));t3 = morph_mvt
load(here("data", "morphology", "t4.RData"));t4 = morph_mvt
load(here("data", "morphology", "t5.RData"));t5 = morph_mvt
load(here("data", "morphology", "t6.RData"));t6 = morph_mvt
load(here("data", "morphology", "t7.RData"));t7 = morph_mvt
rm(morph_mvt)
#Column: time
t0$time = NA
t1$time = NA
#Column: replicate_video
t0$replicate_video[t0$file == "sample_00001"] = 1
t0$replicate_video[t0$file == "sample_00002"] = 2
t0$replicate_video[t0$file == "sample_00003"] = 3
t0$replicate_video[t0$file == "sample_00004"] = 4
t0$replicate_video[t0$file == "sample_00005"] = 5
t0$replicate_video[t0$file == "sample_00006"] = 6
t0$replicate_video[t0$file == "sample_00007"] = 7
t0$replicate_video[t0$file == "sample_00008"] = 8
t0$replicate_video[t0$file == "sample_00009"] = 9
t0$replicate_video[t0$file == "sample_00010"] = 10
t0$replicate_video[t0$file == "sample_00011"] = 11
t0$replicate_video[t0$file == "sample_00012"] = 12
t1$replicate_video = 1 #In t1 I took only 1 video/culture
t2$replicate_video = 1 #In t2 I took only 1 video/culture
t3$replicate_video = 1 #In t3 I took only 1 video/culture
t4$replicate_video = 1 #In t4 I took only 1 video/culture
t5$replicate_video = 1 #In t5 I took only 1 video/culture
t6 = t6 %>% rename(replicate_video = replicate)
t7 = t7 %>% rename(replicate_video = replicate)
cultures_n = max(culture_info$culture_ID)
original_t0_rows = nrow(t0)
ID_vector = rep(1:cultures_n, each = original_t0_rows)
t0 = t0 %>%
slice(rep(1:n(), cultures_n)) %>%
mutate(culture_ID = ID_vector)
t0 = merge(culture_info, t0, by="culture_ID")
t1 = merge(culture_info, t1, by="culture_ID")
t2 = merge(culture_info, t2, by="culture_ID")
t3 = merge(culture_info, t3, by="culture_ID")
t4 = merge(culture_info, t4, by="culture_ID")
t5 = merge(culture_info, t5, by="culture_ID")
t6 = merge(culture_info, t6, by="culture_ID")
t7 = merge(culture_info, t7, by="culture_ID")
ds_body_size = rbind(t0, t1, t2, t3, t4, t5, t6, t7)
rm(t0, t1, t2, t3, t4, t5, t6, t7)
#Column: day
ds_body_size$day = ds_body_size$time_point;
ds_body_size$day[ds_body_size$day=="t0"] = "0"
ds_body_size$day[ds_body_size$day=="t1"] = "4"
ds_body_size$day[ds_body_size$day=="t2"] = "8"
ds_body_size$day[ds_body_size$day=="t3"] = "12"
ds_body_size$day[ds_body_size$day=="t4"] = "16"
ds_body_size$day[ds_body_size$day=="t5"] = "20"
ds_body_size$day[ds_body_size$day=="t6"] = "24"
ds_body_size$day[ds_body_size$day=="t7"] = "28"
ds_body_size$day = as.numeric(ds_body_size$day)
#Column: time point
ds_body_size$time_point[ds_body_size$time_point=="t0"] = 0
ds_body_size$time_point[ds_body_size$time_point=="t1"] = 1
ds_body_size$time_point[ds_body_size$time_point=="t2"] = 2
ds_body_size$time_point[ds_body_size$time_point=="t3"] = 3
ds_body_size$time_point[ds_body_size$time_point=="t4"] = 4
ds_body_size$time_point[ds_body_size$time_point=="t5"] = 5
ds_body_size$time_point[ds_body_size$time_point=="t6"] = 6
ds_body_size$time_point[ds_body_size$time_point=="t7"] = 7
ds_body_size$time_point = as.character(ds_body_size$time_point)
#Select useful columns
ds_body_size = ds_body_size %>%
select(culture_ID,
patch_size,
patch_size_volume,
disturbance,
metaecosystem_type,
mean_area,
replicate_video,
time_point,
day,
metaecosystem,
system_nr,
eco_metaeco_type)
#Reorder columns
ds_body_size = ds_body_size[, c("culture_ID",
"system_nr",
"disturbance",
"time_point",
"day",
"patch_size",
"patch_size_volume",
"metaecosystem",
"metaecosystem_type",
"eco_metaeco_type",
"replicate_video",
"mean_area")]
datatable(ds_body_size,
rownames = FALSE,
options = list(scrollX = TRUE),
filter = list(position = 'top',
clear = FALSE))
## Warning in instance$preRenderHook(instance): It seems your data is too big
## for client-side DataTables. You may consider server-side processing: https://
## rstudio.github.io/DT/server.html
I am here creating 12 size classes as in Jacquet, Gounand, and Altermatt (2020). However, for some reason it seems like our body size classes are really different.
#### --- PARAMETERS & INITIALISATION --- ###
nr_of_size_classes = 12
largest_size = max(ds_body_size$mean_area)
size_class_width = largest_size/nr_of_size_classes
size_class = NULL
### --- CREATE DATASET --- ###
size_class_boundaries = seq(0, largest_size, by = size_class_width)
for (class in 1:nr_of_size_classes){
bin_lower_limit = size_class_boundaries[class]
bin_upper_limit = size_class_boundaries[class+1]
size_input = (size_class_boundaries[class] + size_class_boundaries[class + 1])/2
size_class[[class]] = ds_body_size%>%
filter(bin_lower_limit <= mean_area) %>%
filter(mean_area <= bin_upper_limit) %>%
group_by(culture_ID,
system_nr,
disturbance,
day,
patch_size,
metaecosystem,
metaecosystem_type,
eco_metaeco_type,
replicate_video) %>% #Group by video
summarise(mean_abundance_across_videos = n()) %>%
group_by(culture_ID,
system_nr,
disturbance,
day,
patch_size,
metaecosystem,
metaecosystem_type,
eco_metaeco_type) %>% #Group by ID
summarise(abundance = mean(mean_abundance_across_videos)) %>%
mutate(log_abundance = log(abundance)) %>%
mutate(size_class = class) %>%
mutate(size = size_input) %>%
mutate(log_size = log(size))
}
ds_classes = rbind(size_class[[1]], size_class[[2]], size_class[[3]], size_class[[4]],
size_class[[5]], size_class[[6]], size_class[[7]], size_class[[8]],
size_class[[9]], size_class[[10]], size_class[[11]], size_class[[12]],)
datatable(ds_classes,
rownames = FALSE,
options = list(scrollX = TRUE),
filter = list(position = 'top',
clear = FALSE))
ds_median_body_size)eco_metaeco_types = unique(culture_info$eco_metaeco_type)
ds_median_body_size = ds_body_size %>%
group_by(disturbance,
metaecosystem,
patch_size,
patch_size_volume,
eco_metaeco_type,
culture_ID,
time_point,
day,
replicate_video) %>%
summarise(median_body_size = median(mean_area))
datatable(ds_median_body_size,
rownames = FALSE,
options = list(scrollX = TRUE),
filter = list(position = 'top',
clear = FALSE))
ds_lnRR_median_body_size)eco_metaeco_types = unique(culture_info$eco_metaeco_type)
single_row = NULL
row_n = 0
for (disturbance_input in c("low", "high")){
for (eco_metaeco_input in eco_metaeco_types){
for (time_point_input in 0:7){
row_n = row_n + 1
single_row[[row_n]] = ds_median_body_size %>%
filter(eco_metaeco_type == eco_metaeco_input) %>%
filter(disturbance == disturbance_input) %>%
filter(time_point == time_point_input) %>%
group_by(culture_ID, eco_metaeco_type, patch_size, disturbance, time_point, day) %>%
summarise(median_body_size_across_videos = mean(median_body_size)) %>%
group_by(eco_metaeco_type, patch_size, disturbance, time_point, day) %>%
summarise(mean_median_body_size = mean(median_body_size_across_videos))}}}
ds_lnRR_median_body_size = single_row %>%
bind_rows()
for (patch_size_input in c("S", "M", "L")){
for (disturbance_input in c("low", "high")){
for (time_point_input in 0:7){
averaged_value_isolated_control = ds_lnRR_median_body_size %>%
filter(eco_metaeco_type == patch_size_input) %>%
filter(disturbance == disturbance_input) %>%
filter(time_point == time_point_input) %>%
ungroup() %>%
select(mean_median_body_size)
ds_lnRR_median_body_size$isolated_control[
ds_lnRR_median_body_size$patch_size == patch_size_input &
ds_lnRR_median_body_size$disturbance == disturbance_input &
ds_lnRR_median_body_size$time_point == time_point_input] =
averaged_value_isolated_control}}}
ds_lnRR_median_body_size = ds_lnRR_median_body_size %>%
mutate(isolated_control = as.numeric(isolated_control)) %>%
mutate(lnRR_median_body_size = ln(mean_median_body_size / isolated_control))
saveRDS(ds_lnRR_median_body_size, file = here("results", "ds_lnRR_median_body_size.RData"))
ds_lnRR_median_body_size = readRDS(here("results", "ds_lnRR_median_body_size.RData"))
datatable(ds_lnRR_median_body_size,
rownames = FALSE,
options = list(scrollX = TRUE),
filter = list(position = 'top',
clear = FALSE))
I need to assemble a dataset with the 75% and 95% IQR abundances.
#How do you calculate the quantiles in dplyr?
disturbance_input = "low"
eco_metaeco_input = "S (S_S)"
time_point_input = 4
ds_body_size %>%
filter(disturbance == disturbance_input) %>%
filter(eco_metaeco_type == eco_metaeco_input) %>%
filter(time_point == time_point_input) %>%
filter(culture_ID == 16) %>%
filter(replicate_video == 1) #%>%
## culture_ID system_nr disturbance time_point day patch_size patch_size_volume
## 1 16 16 low 4 16 S 7.5
## 2 16 16 low 4 16 S 7.5
## 3 16 16 low 4 16 S 7.5
## 4 16 16 low 4 16 S 7.5
## 5 16 16 low 4 16 S 7.5
## 6 16 16 low 4 16 S 7.5
## 7 16 16 low 4 16 S 7.5
## 8 16 16 low 4 16 S 7.5
## 9 16 16 low 4 16 S 7.5
## 10 16 16 low 4 16 S 7.5
## 11 16 16 low 4 16 S 7.5
## 12 16 16 low 4 16 S 7.5
## 13 16 16 low 4 16 S 7.5
## metaecosystem metaecosystem_type eco_metaeco_type replicate_video mean_area
## 1 yes S_S S (S_S) 1 3701.1204
## 2 yes S_S S (S_S) 1 8369.0602
## 3 yes S_S S (S_S) 1 8659.1141
## 4 yes S_S S (S_S) 1 2993.2219
## 5 yes S_S S (S_S) 1 695.1015
## 6 yes S_S S (S_S) 1 8617.9950
## 7 yes S_S S (S_S) 1 2539.5349
## 8 yes S_S S (S_S) 1 9167.3573
## 9 yes S_S S (S_S) 1 773.4804
## 10 yes S_S S (S_S) 1 3161.5819
## 11 yes S_S S (S_S) 1 745.1421
## 12 yes S_S S (S_S) 1 3443.2128
## 13 yes S_S S (S_S) 1 2724.6375
#filter(mean_area > quantile(mean_area, 0.125), mean_area < quantile(x, 0.125))
Research question: how does the total biomass of meta-ecosystem changes across meta-ecosystems with different patch sizes?
for (disturbance_input in c("low", "high")){
print(ds_regional_biomass %>%
filter(disturbance == disturbance_input) %>%
filter(!metaecosystem_type == "S_L_from_isolated") %>%
ggplot(aes(x = day,
y = total_regional_bioarea,
group = interaction(day, metaecosystem_type),
fill = metaecosystem_type)) +
geom_boxplot() +
labs(title = paste("Disturbance =", disturbance_input),
x = "Day",
y = "Regional bioarea (µm²)",
fill = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_fill_discrete(labels = c("large-large",
"medium-medium",
"small-large",
"small-small")) +
geom_vline(xintercept = first_perturbation_day + 0.5,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation"))}
Research question: do meta-ecosystems with the same total size but with patches that are either the same size or of different size have a different biomass density? (i.e., do the medium-medium and small-large meta-ecosystems have different biomass density?)
for (disturbance_input in c("low", "high")){
print(ds_regional_biomass %>%
filter ( disturbance == disturbance_input) %>%
filter (metaecosystem_type == "S_L" |
metaecosystem_type == "M_M") %>%
ggplot (aes(x = day,
y = total_regional_bioarea,
group = system_nr,
fill = system_nr,
color = system_nr,
linetype = metaecosystem_type)) +
geom_line () +
labs(title = paste("Disturbance =", disturbance_input),
x = "Day",
y = "Regional bioarea (µm²)",
fill = "System nr",
color = "System nr",
linetype = "") +
scale_x_continuous(limits = c(-2, 30)) +
scale_linetype_discrete(labels = c("medium-medium",
"small-large")) +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
geom_vline(xintercept = first_perturbation_day,
linetype = "dotdash",
color = "grey",
size = 0.7) +
labs(caption = "Vertical grey line: first perturbation"))}
for (disturbance_input in c("low", "high")){
print(ds_regional_biomass %>%
filter(disturbance == disturbance_input) %>%
filter (metaecosystem_type == "S_L" |
metaecosystem_type == "M_M") %>%
ggplot (aes(x = day,
y = total_regional_bioarea,
group = interaction(day, metaecosystem_type),
fill = metaecosystem_type)) +
geom_boxplot() +
labs(title = paste("Disturbance =", disturbance_input),
x = "Day",
y = "Regional bioarea (µm²)",
color = '',
fill = '') +
scale_fill_discrete(labels = c("medium-medium",
"small-large")) +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
geom_vline(xintercept = first_perturbation_day + 0.7,
linetype = "dotdash",
color = "grey",
size = 0.7) +
labs(caption = "Vertical grey line: first perturbation"))}
How does the biomass density of medium-medium and small-large meta-ecosystems differ across the time series? (The first two points before the first disturbance are taken off).
We will exclude in the model the time point 0 and 1. At time point 0 all cultures were the same because that’s how we made them. At time point 1 no disturbance event had already taken place.
Let’s see how linear is the time trend of bioarea and if we can make it more linear with a log10 transformation. We are lucky that during the modelling process we need to drop the first two time points, which would have made the biomass trend not linear.
Linearity of regional bioarea ~ time
ds_regional_biomass %>%
filter(time_point >= 2) %>%
filter(!metaecosystem_type == "S_L_from_isolated") %>%
ggplot(aes(x = day,
y = total_regional_bioarea,
group = day)) +
geom_boxplot() +
labs(x = "Day",
y = "Regional bioarea (µm²)")
linear_model = lm(total_regional_bioarea ~
day,
data = ds_regional_biomass %>%
filter(time_point >= 2) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
par(mfrow=c(2,3))
plot(linear_model, which = 1:5)
Model selection
Let’ start from the full model.
\[ Total \: Regional \: Bioarea = t + M + D + tM + tD + MD + tDM + (t | system \: nr) \]
full = lmer(total_regional_bioarea ~
day * metaecosystem_type * disturbance +
(day | system_nr),
data = ds_regional_biomass %>%
filter(time_point >= 2) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
Should we keep M (metaecosystem_type)?
no_M = lmer(total_regional_bioarea ~
day * disturbance +
(day | system_nr),
data = ds_regional_biomass %>%
filter(time_point >= 2) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
anova(full, no_M)
## Data: ds_regional_biomass %>% filter(time_point >= 2) %>% filter(metaecosystem_type == ...
## Models:
## no_M: total_regional_bioarea ~ day * disturbance + (day | system_nr)
## full: total_regional_bioarea ~ day * metaecosystem_type * disturbance + (day | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_M 8 2788.5 2810.8 -1386.3 2772.5
## full 12 2786.3 2819.8 -1381.2 2762.3 10.22 4 0.03688 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Yes.
Should we keep D (disturbance)?
no_D = lmer(total_regional_bioarea ~
day * metaecosystem_type +
(day | system_nr),
data = ds_regional_biomass %>%
filter(time_point >= 2) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
anova(full, no_D)
## Data: ds_regional_biomass %>% filter(time_point >= 2) %>% filter(metaecosystem_type == ...
## Models:
## no_D: total_regional_bioarea ~ day * metaecosystem_type + (day | system_nr)
## full: total_regional_bioarea ~ day * metaecosystem_type * disturbance + (day | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_D 8 2792.9 2815.2 -1388.4 2776.9
## full 12 2786.3 2819.8 -1381.2 2762.3 14.584 4 0.005648 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Yes.
Should we keep the correlation in
(day | system_nr)?
no_correlation = lmer(total_regional_bioarea ~
day * metaecosystem_type * disturbance +
(day || system_nr),
data = ds_regional_biomass %>%
filter(time_point >= 2) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
anova(full, no_correlation)
## Data: ds_regional_biomass %>% filter(time_point >= 2) %>% filter(metaecosystem_type == ...
## Models:
## no_correlation: total_regional_bioarea ~ day * metaecosystem_type * disturbance + ((1 | system_nr) + (0 + day | system_nr))
## full: total_regional_bioarea ~ day * metaecosystem_type * disturbance + (day | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_correlation 11 2785.8 2816.5 -1381.9 2763.8
## full 12 2786.3 2819.8 -1381.2 2762.3 1.5333 1 0.2156
No.
Should we keep the random effect of system nr on the time slopes
(day | system_nr)?
no_random_slopes = lmer(total_regional_bioarea ~
day * metaecosystem_type * disturbance +
(1 | system_nr),
data = ds_regional_biomass %>%
filter(time_point >= 2) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
anova(no_correlation, no_random_slopes)
## Data: ds_regional_biomass %>% filter(time_point >= 2) %>% filter(metaecosystem_type == ...
## Models:
## no_random_slopes: total_regional_bioarea ~ day * metaecosystem_type * disturbance + (1 | system_nr)
## no_correlation: total_regional_bioarea ~ day * metaecosystem_type * disturbance + ((1 | system_nr) + (0 + day | system_nr))
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_random_slopes 10 2783.8 2811.7 -1381.9 2763.8
## no_correlation 11 2785.8 2816.5 -1381.9 2763.8 0 1 1
No.
Should we keep t * M * D?
no_threeway = lmer(total_regional_bioarea ~
day +
metaecosystem_type +
disturbance +
day : metaecosystem_type +
day : disturbance +
metaecosystem_type : disturbance +
(1 | system_nr),
data = ds_regional_biomass %>%
filter(time_point >= 2) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
REML = FALSE,
control = lmerControl(optimizer = 'optimx',
optCtrl = list(method = 'L-BFGS-B')))
anova(no_random_slopes, no_threeway)
## Data: ds_regional_biomass %>% filter(time_point >= 2) %>% filter(metaecosystem_type == ...
## Models:
## no_threeway: total_regional_bioarea ~ day + metaecosystem_type + disturbance + day:metaecosystem_type + day:disturbance + metaecosystem_type:disturbance + (1 | system_nr)
## no_random_slopes: total_regional_bioarea ~ day * metaecosystem_type * disturbance + (1 | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_threeway 9 2781.9 2807.0 -1382.0 2763.9
## no_random_slopes 10 2783.8 2811.7 -1381.9 2763.8 0.0948 1 0.7582
No.
Should we keep t * M?
no_TM = lmer(total_regional_bioarea ~
day +
metaecosystem_type +
disturbance +
day : disturbance +
metaecosystem_type : disturbance +
(1 | system_nr),
data = ds_regional_biomass %>%
filter(time_point >= 2) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
anova(no_threeway,no_TM)
## Data: ds_regional_biomass %>% filter(time_point >= 2) %>% filter(metaecosystem_type == ...
## Models:
## no_TM: total_regional_bioarea ~ day + metaecosystem_type + disturbance + day:disturbance + metaecosystem_type:disturbance + (1 | system_nr)
## no_threeway: total_regional_bioarea ~ day + metaecosystem_type + disturbance + day:metaecosystem_type + day:disturbance + metaecosystem_type:disturbance + (1 | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_TM 8 2787.3 2809.6 -1385.7 2771.3
## no_threeway 9 2781.9 2807.0 -1382.0 2763.9 7.3941 1 0.006544 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Yes.
Should we keep t * D?
no_TD = lmer(total_regional_bioarea ~
day +
metaecosystem_type +
disturbance +
day : metaecosystem_type +
metaecosystem_type : disturbance +
(1 | system_nr),
data = ds_regional_biomass %>%
filter(time_point >= 2) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
anova(no_threeway, no_TD)
## Data: ds_regional_biomass %>% filter(time_point >= 2) %>% filter(metaecosystem_type == ...
## Models:
## no_TD: total_regional_bioarea ~ day + metaecosystem_type + disturbance + day:metaecosystem_type + metaecosystem_type:disturbance + (1 | system_nr)
## no_threeway: total_regional_bioarea ~ day + metaecosystem_type + disturbance + day:metaecosystem_type + day:disturbance + metaecosystem_type:disturbance + (1 | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_TD 8 2779.9 2802.2 -1382 2763.9
## no_threeway 9 2781.9 2807.0 -1382 2763.9 0.021 1 0.8847
No.
Should we keep M * D?
no_MD = lmer(total_regional_bioarea ~
day +
metaecosystem_type +
disturbance +
day : metaecosystem_type +
(1 | system_nr),
data = ds_regional_biomass %>%
filter(time_point >= 2) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
anova(no_TD, no_MD)
## Data: ds_regional_biomass %>% filter(time_point >= 2) %>% filter(metaecosystem_type == ...
## Models:
## no_MD: total_regional_bioarea ~ day + metaecosystem_type + disturbance + day:metaecosystem_type + (1 | system_nr)
## no_TD: total_regional_bioarea ~ day + metaecosystem_type + disturbance + day:metaecosystem_type + metaecosystem_type:disturbance + (1 | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_MD 7 2778.3 2797.8 -1382.2 2764.3
## no_TD 8 2779.9 2802.2 -1382.0 2763.9 0.3541 1 0.5518
No.
Best model
Therefore, our best model is:
\[ Regional \: bioarea = t + M + D + tM + (1 | system \: nr) \]
best_model = no_MD
Let’s do some model diagnostics:
plot(best_model)
qqnorm(resid(best_model))
The R squared of this model for t2-t7 are:
R2_marginal = r.squaredGLMM(best_model)[1]
R2_marginal = round(R2_marginal, digits = 2)
R2_conditional = r.squaredGLMM(best_model)[2]
R2_conditional = round(R2_conditional, digits = 2)
Marginal R2 = 0.76
Conditional R2 = 0.78
To calculate the R of the single explaining variables, let’s consider its inclusive R squares (see Stoffel, Nakagawa, and Schielzeth (2021) ). These contain the variance of a variable including its collinearity with other variables. I chose them because there is no collinearity between meta-ecosystem type, disturbance, and day (they were experimentally crossed). Check the other type of R squared you can calculate and whether you should check if they give you the same result. The inclusive R squares of this model are:
R2_regional = partR2(best_model,
partvars = c("day",
"metaecosystem_type",
"disturbance"),
R2_type = "conditional",
nboot = 1000,
CI = 0.95)
saveRDS(R2_regional, file = here("results", "biomass", "R2_regional_t2t7.RData"))
R2_regional = readRDS(here("results", "biomass", "R2_regional_t2t7.RData"))
R2_regional$IR2
## # A tibble: 4 × 4
## term estimate CI_lower CI_upper
## <chr> <dbl> <dbl> <dbl>
## 1 day 0.684 0.605 0.758
## 2 metaecosystem_typeS_L 0.0232 0.00311 0.0663
## 3 disturbancelow 0.0543 0.0153 0.111
## 4 day:metaecosystem_typeS_L 0.137 0.0773 0.215
t2 - t5
Let’s just assume that this model holds also for t2-t5. Then, let’s recalculate the R squared.
t2_t5 = lmer(total_regional_bioarea ~
day +
metaecosystem_type +
disturbance +
day : metaecosystem_type +
(1 | system_nr),
data = ds_regional_biomass %>%
filter(time_point >= 2) %>%
filter(time_point <= 5) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
plot(t2_t5)
qqnorm(resid(t2_t5))
R2_marginal = r.squaredGLMM(t2_t5)[1]
R2_marginal = round(R2_marginal, digits = 2)
R2_conditional = r.squaredGLMM(t2_t5)[2]
R2_conditional = round(R2_conditional, digits = 2)
The R squared of this model for t2-t5 are:
Marginal R2 = 0.6
Conditional R2 = 0.61
The inclusive R squares of this model are:
R2_regional = partR2(t2_t5,
partvars = c("day",
"metaecosystem_type",
"disturbance"),
R2_type = "conditional",
nboot = 1000,
CI = 0.95)
saveRDS(R2_regional, file = here("results", "biomass", "R2_regional_t2t5.RData"))
R2_regional = readRDS(here("results", "biomass", "R2_regional_t2t5.RData"))
R2_regional$IR2
## # A tibble: 4 × 4
## term estimate CI_lower CI_upper
## <chr> <dbl> <dbl> <dbl>
## 1 day 0.462 0.334 0.603
## 2 metaecosystem_typeS_L 0.0589 0.0103 0.153
## 3 disturbancelow 0.0770 0.0161 0.178
## 4 day:metaecosystem_typeS_L 0.147 0.0615 0.275
How does the biomass density of medium-medium and small-large meta-ecosystems differ for each time point? (The first two points before the first disturbance are taken off).
Let’s now look at the full model and see if we should keep the interaction between meta-ecosystem type and disturbance. We are not using mixed effects because a certain system nr can’t be at different perturbations or at different meta-ecosystem types.
\[ Total \: Regional \: Bioarea = M + D + MD \]
Let’s start by looking at whether meta-ecosystem type and disturbance had an effect at time point = 1. At this time point, no disturbance event had yet occurred. Therefore, we would not expect an effect of disturbance. In regards to meta-ecosystem type, there might be an effect if it comes from just the sizes of the two ecosystems.
chosen_time_point = 1
full = lm(total_regional_bioarea ~
metaecosystem_type +
disturbance +
metaecosystem_type * disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
Does disturbance have an effect?
no_D = lm(total_regional_bioarea ~
metaecosystem_type +
metaecosystem_type * disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
AIC(full,no_D)
## df AIC
## full 5 469.0232
## no_D 5 469.0232
No.
Does meta-ecosystem type have an effect?
no_M = lm(total_regional_bioarea ~
disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
AIC(full,no_M)
## df AIC
## full 5 469.0232
## no_M 3 466.9816
No.
chosen_time_point = 2
full = lm(total_regional_bioarea ~
metaecosystem_type +
disturbance +
metaecosystem_type * disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
Should we keep M?
no_M = lm(total_regional_bioarea ~
disturbance +
metaecosystem_type * disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
AIC(full,no_M)
## df AIC
## full 5 482.5831
## no_M 5 482.5831
Should we keep D?
no_D = lm(total_regional_bioarea ~
metaecosystem_type +
metaecosystem_type * disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
AIC(full,no_D)
## df AIC
## full 5 482.5831
## no_D 5 482.5831
Should we keep M * D?
no_MD = lm(total_regional_bioarea ~
metaecosystem_type +
disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
AIC(full,no_MD)
## df AIC
## full 5 482.5831
## no_MD 4 481.5121
No.
best_model = no_MD
par(mfrow=c(2,3))
plot(best_model, which = 1:5)
R2_full = glance(best_model)$r.squared
no_M = lm(total_regional_bioarea ~
disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
R2_no_M = glance(no_M)$r.squared
R2_M = R2_full - R2_no_M
R2_full = round(R2_full, digits = 2)
R2_M = round(R2_M, digits = 2)
The adjusted R squared of the model is 0.17 and the adjusted R squared of patch type is 0.16.
chosen_time_point = 3
full = lm(total_regional_bioarea ~
metaecosystem_type +
disturbance +
metaecosystem_type * disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
Should we keep M?
no_M = lm(total_regional_bioarea ~
disturbance +
metaecosystem_type * disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
AIC(full,no_M)
## df AIC
## full 5 467.0762
## no_M 5 467.0762
Should we keep D?
no_D = lm(total_regional_bioarea ~
metaecosystem_type +
metaecosystem_type * disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
AIC(full,no_D)
## df AIC
## full 5 467.0762
## no_D 5 467.0762
Should we keep M * D?
no_MD = lm(total_regional_bioarea ~
metaecosystem_type +
disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
AIC(full,no_MD)
## df AIC
## full 5 467.0762
## no_MD 4 468.2493
Yes.
best_model = full
par(mfrow=c(2,3))
plot(best_model, which = 1:5)
R2_full = glance(best_model)$r.squared
no_M = lm(total_regional_bioarea ~
disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
R2_no_M = glance(no_M)$r.squared
R2_M = R2_full - R2_no_M
R2_full = round(R2_full, digits = 2)
R2_M = round(R2_M, digits = 2)
The adjusted R squared of the model is 0.61 and the adjusted R squared of patch type is 0.36 (which includes also the interaction with disturbance).
chosen_time_point = 4
full = lm(total_regional_bioarea ~
metaecosystem_type +
disturbance +
metaecosystem_type * disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
Should we keep M?
no_M = lm(total_regional_bioarea ~
disturbance +
metaecosystem_type * disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
AIC(full,no_M)
## df AIC
## full 5 472.5856
## no_M 5 472.5856
Should we keep D?
no_D = lm(total_regional_bioarea ~
metaecosystem_type +
metaecosystem_type * disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
AIC(full,no_D)
## df AIC
## full 5 472.5856
## no_D 5 472.5856
Should we keep M * D?
no_MD = lm(total_regional_bioarea ~
metaecosystem_type +
disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
AIC(full,no_MD)
## df AIC
## full 5 472.5856
## no_MD 4 471.0777
No.
best_model = no_MD
par(mfrow=c(2,3))
plot(best_model, which = 1:5)
R2_full = glance(best_model)$r.squared
no_M = lm(total_regional_bioarea ~
disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
R2_no_M = glance(no_M)$r.squared
R2_M = R2_full - R2_no_M
R2_full = round(R2_full, digits = 2)
R2_M = round(R2_M, digits = 2)
The adjusted R squared of the model is 0.21 and the adjusted R squared of patch type is 0.02.
chosen_time_point = 5
full = lm(total_regional_bioarea ~
metaecosystem_type +
disturbance +
metaecosystem_type * disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
Should we keep M?
no_M = lm(total_regional_bioarea ~
disturbance +
metaecosystem_type * disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
AIC(full,no_M)
## df AIC
## full 5 466.1591
## no_M 5 466.1591
Should we keep D?
no_D = lm(total_regional_bioarea ~
metaecosystem_type +
metaecosystem_type * disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
AIC(full,no_D)
## df AIC
## full 5 466.1591
## no_D 5 466.1591
Should we keep M * D?
no_MD = lm(total_regional_bioarea ~
metaecosystem_type +
disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
AIC(full,no_MD)
## df AIC
## full 5 466.1591
## no_MD 4 464.1787
No.
best_model = no_MD
par(mfrow=c(2,3))
plot(best_model, which = 1:5)
R2_full = glance(best_model)$r.squared
no_M = lm(total_regional_bioarea ~
disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
R2_no_M = glance(no_M)$r.squared
R2_M = R2_full - R2_no_M
R2_full = round(R2_full, digits = 2)
R2_M = round(R2_M, digits = 2)
The adjusted R squared of the model is 0.31 and the adjusted R squared of patch type is 0.02.
chosen_time_point = 6
full = lm(total_regional_bioarea ~
metaecosystem_type +
disturbance +
metaecosystem_type * disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
Should we keep M?
no_M = lm(total_regional_bioarea ~
disturbance +
metaecosystem_type * disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
AIC(full,no_M)
## df AIC
## full 5 439.5165
## no_M 5 439.5165
Should we keep D?
no_D = lm(total_regional_bioarea ~
metaecosystem_type +
metaecosystem_type * disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
AIC(full,no_D)
## df AIC
## full 5 439.5165
## no_D 5 439.5165
Should we keep M * D?
no_MD = lm(total_regional_bioarea ~
metaecosystem_type +
disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
AIC(full,no_MD)
## df AIC
## full 5 439.5165
## no_MD 4 438.0414
No.
best_model = no_MD
par(mfrow=c(2,3))
plot(best_model, which = 1:5)
R2_full = glance(best_model)$r.squared
no_M = lm(total_regional_bioarea ~
disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
R2_no_M = glance(no_M)$r.squared
R2_M = R2_full - R2_no_M
R2_full = round(R2_full, digits = 2)
R2_M = round(R2_M, digits = 2)
The adjusted R squared of the model is 0.45 and the adjusted R squared of patch type is 0.
chosen_time_point = 7
full = lm(total_regional_bioarea ~
metaecosystem_type +
disturbance +
metaecosystem_type * disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
Should we keep M?
no_M = lm(total_regional_bioarea ~
disturbance +
metaecosystem_type * disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
AIC(full,no_M)
## df AIC
## full 5 427.4663
## no_M 5 427.4663
Should we keep D?
no_D = lm(total_regional_bioarea ~
metaecosystem_type +
metaecosystem_type * disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
AIC(full,no_D)
## df AIC
## full 5 427.4663
## no_D 5 427.4663
Should we keep M * D?
no_MD = lm(total_regional_bioarea ~
metaecosystem_type +
disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
AIC(full,no_MD)
## df AIC
## full 5 427.4663
## no_MD 4 425.6015
No.
best_model = no_MD
par(mfrow=c(2,3))
plot(best_model, which = 1:5)
R2_full = glance(best_model)$r.squared
no_M = lm(total_regional_bioarea ~
disturbance,
data = ds_regional_biomass %>%
filter(time_point == chosen_time_point) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
R2_no_M = glance(no_M)$r.squared
R2_M = R2_full - R2_no_M
R2_full = round(R2_full, digits = 2)
R2_M = round(R2_M, digits = 2)
The adjusted R squared of the model is 0.59 and the adjusted R squared of patch type is 0.02.
Research question: how does the biomass density of meta-ecosystems change according to the size of their patches?
for (disturbance_input in c("low", "high")){
print(ds_regional_biomass %>%
filter (disturbance == disturbance_input) %>%
filter(!metaecosystem_type == "S_L") %>%
filter(!metaecosystem_type == "S_L_from_isolated") %>%
ggplot (aes(x = day,
y = total_regional_bioarea,
group = system_nr,
fill = system_nr,
color = system_nr,
linetype = metaecosystem_type)) +
geom_line () +
labs(title = paste("Disturbance =", disturbance_input),
x = "Day",
y = "Regional bioarea (µm²)",
fill = "System nr",
linetype = "") +
scale_colour_continuous(guide = "none") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_linetype_discrete(labels = c("large-large",
"medium-medium",
"small-small")) +
geom_vline(xintercept = first_perturbation_day,
linetype = "dotdash",
color = "grey",
size = 0.7) +
labs(caption = "Vertical grey line: first perturbation"))}
for (disturbance_input in c("low", "high")){
print(ds_regional_biomass %>%
filter(disturbance == disturbance_input) %>%
filter(!metaecosystem_type == "S_L") %>%
filter(!metaecosystem_type == "S_L_from_isolated") %>%
ggplot(aes(x = day,
y = total_regional_bioarea,
group = interaction(day, metaecosystem_type),
fill = metaecosystem_type)) +
geom_boxplot() +
labs(title = "Disturbance = low",
x = "Day",
y = "Regional bioarea (µm²)",
fill = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_fill_discrete(labels = c("large-large",
"medium-medium",
"small-small")) +
geom_vline(xintercept = first_perturbation_day + 0.7,
linetype = "dotdash",
color = "grey",
size = 0.7) +
labs(caption = "Vertical grey line: first perturbation"))}
I’m not sure how I would model it because I would have three meta-ecosystem types. Better not to model it, as I don’t see a reason to do it.
I don’t see a reason to model how total bioarea changes across (i) small-small, (ii) medium-medium, and (iii) large-large meta-ecosystems.
Research question: how does the density of bioarea changes across patch types?
for (disturbance_input in c("low", "high")) {
print(ds_biomass_abund %>%
group_by(culture_ID, disturbance, day, eco_metaeco_type) %>%
summarise(bioarea_per_volume_video_averaged = mean(bioarea_per_volume)) %>%
filter(disturbance == disturbance_input) %>%
ggplot(aes(x = day,
y = bioarea_per_volume_video_averaged,
group = interaction(day, eco_metaeco_type),
fill = eco_metaeco_type)) +
geom_boxplot() +
labs(title = paste("Disturbance =", disturbance_input),
x = "Day",
y = "Local bioarea (µm²/µl)",
fill = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
# legend.position = c(.99, .999),
# legend.justification = c("right", "top"),
# legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_fill_discrete(labels = c("large isolated",
"large connected to large",
"large connected to small",
"medium isolated",
"medium connected to medium",
"small isolated",
"small connected to large",
"small connected to small")) +
geom_vline(xintercept = first_perturbation_day + 0.6,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation"))}
Research question: how does total bioarea changes across patch types?
for (disturbance_input in c("low", "high")) {
print(ds_biomass_abund %>%
filter(disturbance == disturbance_input) %>%
group_by(culture_ID, system_nr, disturbance, time_point, day, patch_size, patch_size_volume, eco_metaeco_type) %>%
summarise(bioarea_tot_video_averaged = mean(bioarea_tot)) %>%
ggplot(aes(x = day,
y = bioarea_tot_video_averaged,
group = interaction(day, eco_metaeco_type),
fill = eco_metaeco_type)) +
geom_boxplot() +
labs(title = paste("Disturbance =", disturbance_input),
x = "Day",
y = "Total patch bioarea (µm²)",
fill = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
# legend.position = c(.99, .999),
# legend.justification = c("right", "top"),
# legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_fill_discrete(labels = c("large isolated",
"large connected to large",
"large connected to small",
"medium isolated",
"medium connected to medium",
"small isolated",
"small connected to large",
"small connected to small")) +
geom_vline(xintercept = first_perturbation_day + 0.6,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation"))}
Research question: how does biomass density change according to the size to which the patch is connected? (Does a small patch connected to a small patch have the same biomass density than a small patch connected to a large patch?)
for (disturbance_input in c("low", "high")) {
print(ds_biomass_abund %>%
filter(disturbance == disturbance_input) %>%
filter(patch_size == "S") %>%
ggplot(aes(x = day,
y = bioarea_per_volume,
group = culture_ID,
fill = culture_ID,
color = culture_ID,
linetype = eco_metaeco_type)) +
geom_line(stat = "summary", fun = "mean") +
labs(title = paste("Disturbance =", disturbance_input),
x = "Day",
y = "Local bioarea (µm²/μl)",
linetype = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_linetype_discrete(labels = c("small isolated",
"small connected to small",
"small connected to large")) +
geom_vline(xintercept = first_perturbation_day,
linetype = "dotdash",
color = "grey",
size = 0.7) +
labs(caption = "Vertical grey line: first perturbation"))}
for (disturbance_input in c("low", "high")) {
print(ds_biomass_abund %>%
filter(disturbance == disturbance_input) %>%
filter(patch_size == "S") %>%
ggplot(aes(x = day,
y = bioarea_per_volume,
group = interaction(day,eco_metaeco_type),
fill = eco_metaeco_type)) +
geom_boxplot() +
labs(title = paste("Disturbance =", disturbance_input),
x = "Day",
y = "Local bioarea (µm²/μl)",
fill = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_fill_discrete(labels = c("small isolated",
"small connected to small",
"small connected to large")) +
geom_vline(xintercept = first_perturbation_day + 0.7,
linetype = "dotdash",
color = "grey",
size = 0.7) +
labs(caption = "Vertical grey line: first perturbation"))}
for (disturbance_input in c("low", "high")) {
print(ds_effect_size_bioarea_density %>%
filter(!time_point == 0) %>% #At time point 0 all cultures were the same
filter(disturbance == disturbance_input) %>%
filter(eco_metaeco_type == "S (S_S)" | eco_metaeco_type == "S (S_L)") %>%
ggplot(aes(x = day,
y = bioarea_density_lnRR,
color = eco_metaeco_type)) +
geom_point(position = position_dodge(0.5)) +
geom_line(position = position_dodge(0.5)) +
labs(title = paste("Disturbance =", disturbance_input),
x = "Day",
y = "LnRR bioarea density",
color = "") +
scale_color_discrete(labels = c("small connected to large",
"small connnected to small")) +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.40, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
#geom_vline(xintercept = first_perturbation_day + 0.7,
# linetype="dotdash",
# color = "grey",
# size=0.7) +
geom_hline(yintercept = 0,
linetype = "dotted",
color = "black",
size = 0.7))
}
Let’s start from the full model (no mixed effect: meta-ecosystems have been pulled to create the lnRR):
\[ ln \: RR (bioarea \: density) = t + P + D + t*P + t*D + P*D \]
lnRR(bioarea density) = lnRR of the bioarea density (base level is calculated at each disturbance level and time point as the mean bioarea of the small isolated patches)
t = time
P = patch type
D = disturbance
We will exclude in the model the time point 0 and 1. At time point 0 all cultures were the same because that’s how we made them. At time point 1 no disturbance event had already taken place.
first_time_point = 2
last_time_point = 7
full_model = lm(bioarea_density_hedges_d ~
day +
eco_metaeco_type +
disturbance +
day * eco_metaeco_type +
day * disturbance +
eco_metaeco_type * disturbance,
data = ds_effect_size_bioarea_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "S (S_S)" | eco_metaeco_type == "S (S_L)"))
Should we keep P?
no_P = lm(bioarea_density_hedges_d ~
day +
disturbance +
day * eco_metaeco_type +
day * disturbance +
eco_metaeco_type * disturbance,
data = ds_effect_size_bioarea_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "S (S_S)" | eco_metaeco_type == "S (S_L)"))
AIC(full_model, no_P)
## df AIC
## full_model 8 50.96708
## no_P 8 50.96708
Yes.
Should we keep D?
no_D = lm(bioarea_density_hedges_d ~
day +
eco_metaeco_type +
day * eco_metaeco_type +
day * disturbance +
eco_metaeco_type * disturbance,
data = ds_effect_size_bioarea_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "S (S_S)" | eco_metaeco_type == "S (S_L)"))
AIC(full_model, no_D)
## df AIC
## full_model 8 50.96708
## no_D 8 50.96708
Yes.
Should we keep t * P?
no_TP = lm(bioarea_density_hedges_d ~
day +
eco_metaeco_type +
disturbance +
day * disturbance +
eco_metaeco_type * disturbance,
data = ds_effect_size_bioarea_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "S (S_S)" | eco_metaeco_type == "S (S_L)"))
AIC(full_model, no_TP)
## df AIC
## full_model 8 50.96708
## no_TP 7 57.36369
Yes.
Should we keep t * D?
no_TD = lm(bioarea_density_hedges_d ~
day +
eco_metaeco_type +
disturbance +
day * eco_metaeco_type +
eco_metaeco_type * disturbance,
data = ds_effect_size_bioarea_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "S (S_S)" | eco_metaeco_type == "S (S_L)"))
AIC(full_model, no_TD)
## df AIC
## full_model 8 50.96708
## no_TD 7 55.31520
No.
Should we keep P * D?
no_PD = lm(bioarea_density_hedges_d ~
day +
eco_metaeco_type +
disturbance +
day * eco_metaeco_type,
data = ds_effect_size_bioarea_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "S (S_S)" | eco_metaeco_type == "S (S_L)"))
AIC(no_TD, no_PD)
## df AIC
## no_TD 7 55.31520
## no_PD 6 58.48419
No.
Then our best model is:
\[ lnRR (bioarea \: density) = t + P + D + t*P \]
Let’s then do some model diagnostics.
best_model = no_PD
par(mfrow = c(2,3))
plot(best_model, which = 1:5)
R2_full = glance(best_model)$r.squared
no_patch_type = lm(bioarea_density_hedges_d ~
day +
disturbance,
data = ds_effect_size_bioarea_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "S (S_S)" | eco_metaeco_type == "S (S_L)"))
R2_no_P = glance(no_patch_type)$r.squared
R2_P = R2_full - R2_no_P
R2_full = round(R2_full, digits = 2)
R2_P = round(R2_P, digits = 2)
The adjusted R squared of the model is 0.6 and the adjusted R squared of
patch type is 0.46 (which includes also its interaction with
disturbance).
Single time points: see modelling choices.
Research question: how does biomass density change according to the size to which the patch is connected? (i.e., does a large patch connected to a large patch have the same biomass density than a large patch connected to a small patch?)
for (disturbance_input in c("low", "high")) {
print(ds_biomass_abund %>%
filter(disturbance == disturbance_input) %>%
filter(patch_size == "L") %>%
ggplot(aes(x = day,
y = bioarea_per_volume,
group = system_nr,
fill = system_nr,
color = system_nr,
linetype = eco_metaeco_type)) +
geom_line(stat = "summary", fun = "mean") +
labs(x = "Day",
y = "Local bioarea (µm²/μl)",
title = paste("Disturbance =", disturbance_input),
linetype = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_linetype_discrete(labels = c("large isolated",
"large connected to large",
"large connected to small")) +
geom_vline(xintercept = first_perturbation_day,
linetype="dotdash",
color = "grey",
size = 0.7) +
labs(caption = "Vertical grey line: first perturbation"))}
for (disturbance_input in c("low", "high")){
print(ds_biomass_abund %>%
filter(disturbance == disturbance_input) %>%
filter(patch_size == "L") %>%
ggplot(aes(x = day,
y = bioarea_per_volume,
group = interaction(day,eco_metaeco_type),
fill = eco_metaeco_type)) +
geom_boxplot() +
labs(title = paste("Disturbance =", disturbance_input),
x = "Day",
y = "Local bioarea (µm²/μl)",
fill = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_fill_discrete(labels = c("large isolated",
"large connected to large",
"large connected to small")) +
geom_vline(xintercept = first_perturbation_day + 0.7,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation"))}
for (disturbance_input in c("low", "high")){
print(ds_effect_size_bioarea_density %>%
filter(!time_point == 0) %>% #At time point 0 all cultures were the same
filter(disturbance == disturbance_input) %>%
filter(eco_metaeco_type == "L (L_L)" | eco_metaeco_type == "L (S_L)") %>%
ggplot(aes(x = day,
y = bioarea_density_hedges_d,
color = eco_metaeco_type)) +
geom_point(position = position_dodge(0.5)) +
geom_line(position = position_dodge(0.5)) +
labs(title = paste("Disturbance =", disturbance_input),
x = "Day",
y = "hedge's d local bioarea",
color = "") +
#geom_errorbar(aes(ymin = lnRR_lower,
# ymax = lnRR_upper),
# width = .2,
# position = position_dodge(0.5)) +
scale_color_discrete(labels = c("large connected to large",
"large connnected to small")) +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.90, .97),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
# geom_vline(xintercept = first_perturbation_day + 0.7,
# linetype="dotdash",
# color = "grey",
# size=0.7) +
geom_hline(yintercept = 0,
linetype = "dotted",
color = "black",
size = 0.7))}
We will exclude in the model the time point 0 and 1. At time point 0 all cultures were the same because that’s how we made them. At time point 1 no disturbance event had already taken place.
first_time_point = 2
last_time_point = 7
Let’s start from the full model (no mixed effect: meta-ecosystems have been pulled to create the lnRR):
\[ ln \: RR (bioarea \: density) = t + P + D + t*P + t*D + P*D \]
lnRR(bioarea density) = lnRR of the bioarea density (base level is calculated at each disturbance level and time point as the mean bioarea of the small isolated patches)
t = time
P = patch type
D = disturbance
full_model = lm(bioarea_density_hedges_d ~
day +
eco_metaeco_type +
disturbance +
day * eco_metaeco_type +
day * disturbance +
eco_metaeco_type * disturbance,
data = ds_effect_size_bioarea_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "L (L_L)" |
eco_metaeco_type == "L (S_L)"))
Should we keep P?
no_P = lm(bioarea_density_hedges_d ~
day +
disturbance +
day * eco_metaeco_type +
day * disturbance +
eco_metaeco_type * disturbance,
data = ds_effect_size_bioarea_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "L (L_L)" |
eco_metaeco_type == "L (S_L)"))
AIC(full_model, no_P)
## df AIC
## full_model 8 50.89774
## no_P 8 50.89774
Let’s see.
Should we keep D?
no_D = lm(bioarea_density_hedges_d ~
day +
eco_metaeco_type +
day * eco_metaeco_type +
day * disturbance +
eco_metaeco_type * disturbance,
data = ds_effect_size_bioarea_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "L (L_L)" |
eco_metaeco_type == "L (S_L)"))
AIC(full_model, no_D)
## df AIC
## full_model 8 50.89774
## no_D 8 50.89774
Let’s see.
Should we keep t * P?
no_TP = lm(bioarea_density_hedges_d ~
day +
eco_metaeco_type +
disturbance +
day * disturbance +
eco_metaeco_type * disturbance,
data = ds_effect_size_bioarea_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "L (L_L)" |
eco_metaeco_type == "L (S_L)"))
AIC(full_model, no_TP)
## df AIC
## full_model 8 50.89774
## no_TP 7 48.94521
No.
Should we keep t * D?
no_TD = lm(bioarea_density_hedges_d ~
day +
eco_metaeco_type +
disturbance +
eco_metaeco_type * disturbance,
data = ds_effect_size_bioarea_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "L (L_L)" |
eco_metaeco_type == "L (S_L)"))
AIC(no_TP, no_TD)
## df AIC
## no_TP 7 48.94521
## no_TD 6 50.49800
Yes.
Should we keep P * D?
no_PD = lm(bioarea_density_hedges_d ~
day +
eco_metaeco_type +
disturbance +
day * disturbance,
data = ds_effect_size_bioarea_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "L (L_L)" |
eco_metaeco_type == "L (S_L)"))
AIC(no_TP, no_PD)
## df AIC
## no_TP 7 48.94521
## no_PD 6 48.01583
No.
Then our best model is:
\[ lnRR (bioarea \: density) = t + P + D + tP \]
Let’s then do some model diagnostics.
best_model = no_PD
par(mfrow = c(2,3))
plot(best_model, which = 1:5)
R2_full = glance(best_model)$r.squared
no_patch_type = lm(bioarea_density_hedges_d ~
day +
disturbance +
day * disturbance,
data = ds_effect_size_bioarea_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "L (L_L)" |
eco_metaeco_type == "L (S_L)"))
R2_no_P = glance(no_patch_type)$r.squared
R2_P = R2_full - R2_no_P
R2_full = round(R2_full, digits = 2)
R2_P = round(R2_P, digits = 2)
The adjusted R squared of the model is 0.35 and the adjusted R squared
of patch type is 0.08 (which includes also its interaction with
disturbance).
Single time points: see modelling choices.
Research question: how does biomass density change according to the size of isolated patches? (How does the biomass of small, medium, and large patches change?)
for (disturbance_input in c("low", "high")){
print(ds_biomass_abund %>%
filter ( disturbance == disturbance_input) %>%
filter(metaecosystem == "no") %>%
group_by (system_nr, day, patch_size_volume) %>%
summarise(mean_bioarea_per_volume_across_videos = mean(bioarea_per_volume)) %>%
ggplot (aes(x = day,
y = mean_bioarea_per_volume_across_videos,
group = system_nr,
fill = system_nr,
color = system_nr,
linetype = as.character(patch_size_volume))) +
geom_line () +
labs(title = paste("Disturbance =", disturbance_input),
x = "Day",
y = "Regional bioarea (µm²/µl)",
fill = "System nr",
linetype = "") +
scale_y_continuous(limits = c(0, 6250)) +
scale_x_continuous(limits = c(-2, 30)) +
scale_colour_continuous(guide = "none") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_linetype_discrete(labels = c("medium isolated",
"large isolated",
"small isolated")))}
for (disturbance_input in c("low", "high")){
print(ds_biomass_abund %>%
filter(disturbance == disturbance_input) %>%
filter(metaecosystem == "no") %>%
ggplot(aes(x = day,
y = bioarea_per_volume,
group = interaction(day, patch_size_volume),
fill = as.character(patch_size_volume))) +
geom_boxplot() +
labs(title = paste("Disturbance =", disturbance_input),
x = "Day",
y = "Local bioarea (µm²/μl)",
fill = "") +
scale_fill_discrete(labels = c("isolated medium",
"isolated large",
"isolated small")) +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)))}
We will exclude in the model the time point 0 and 1. At time point 0 all cultures were the same because that’s how we made them. At time point 1 no disturbance event had already taken place.
Let’s see how linear is the time trend of bioarea and if we can make it more linear with a log10 transformation. We are lucky that during the modelling process we need to drop the first two time points, which would have made the biomass trend not linear.
Linearity of regional bioarea ~ time
ds_biomass_abund %>%
filter(time_point >= 2) %>%
ggplot(aes(x = day,
y = bioarea_per_volume,
group = day)) +
geom_boxplot() +
labs(x = "Day",
y = "Regional bioarea (µm²)")
linear_model = lm(bioarea_per_volume ~
day,
data = ds_biomass_abund %>%
filter(time_point >= 2) %>%
filter(metaecosystem == "no"))
par(mfrow=c(2,3))
plot(linear_model, which = 1:5)
Model selection
Let’ start from the full model.
\[ Local \: Bioarea \: Density = t + P + D + tP + tD + PD + tDP + (t | system \: nr) \]
full = lmer(bioarea_per_volume ~
day * patch_size_volume * disturbance +
(day | system_nr),
data = ds_biomass_abund %>%
filter(time_point >= 2) %>%
filter(metaecosystem == "no"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
Should we keep P (patch_size_volume)?
Need to figure this out.
Should we keep D (disturbance)?
Need to figure this out.
Should we keep the correlation in
(day | system_nr)?
no_correlation = lmer(bioarea_per_volume ~
day * patch_size_volume * disturbance +
(day || system_nr),
data = ds_biomass_abund %>%
filter(time_point >= 2) %>%
filter(metaecosystem == "no"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
anova(full, no_correlation)
## Data: ds_biomass_abund %>% filter(time_point >= 2) %>% filter(metaecosystem == ...
## Models:
## no_correlation: bioarea_per_volume ~ day * patch_size_volume * disturbance + ((1 | system_nr) + (0 + day | system_nr))
## full: bioarea_per_volume ~ day * patch_size_volume * disturbance + (day | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_correlation 11 3659.1 3697.0 -1818.5 3637.1
## full 12 3657.7 3699.1 -1816.9 3633.7 3.3679 1 0.06648 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
No.
Should we keep the random effect of system nr on the time slopes
(day | system_nr)?
no_random_slopes = lmer(bioarea_per_volume ~
day * patch_size_volume * disturbance +
(1 | system_nr),
data = ds_biomass_abund %>%
filter(time_point >= 2) %>%
filter(metaecosystem == "no"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
anova(no_correlation, no_random_slopes)
## Data: ds_biomass_abund %>% filter(time_point >= 2) %>% filter(metaecosystem == ...
## Models:
## no_random_slopes: bioarea_per_volume ~ day * patch_size_volume * disturbance + (1 | system_nr)
## no_correlation: bioarea_per_volume ~ day * patch_size_volume * disturbance + ((1 | system_nr) + (0 + day | system_nr))
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_random_slopes 10 3657.1 3691.6 -1818.5 3637.1
## no_correlation 11 3659.1 3697.0 -1818.5 3637.1 0 1 1
No.
Should we keep t * M * D?
no_threeway = lmer(bioarea_per_volume ~
day +
patch_size_volume +
disturbance +
day : patch_size_volume +
day : disturbance +
patch_size_volume : disturbance +
(1 | system_nr),
data = ds_biomass_abund %>%
filter(time_point >= 2) %>%
filter(metaecosystem == "no"),
REML = FALSE,
control = lmerControl(optimizer = 'optimx',
optCtrl = list(method = 'L-BFGS-B')))
anova(no_random_slopes, no_threeway)
## Data: ds_biomass_abund %>% filter(time_point >= 2) %>% filter(metaecosystem == ...
## Models:
## no_threeway: bioarea_per_volume ~ day + patch_size_volume + disturbance + day:patch_size_volume + day:disturbance + patch_size_volume:disturbance + (1 | system_nr)
## no_random_slopes: bioarea_per_volume ~ day * patch_size_volume * disturbance + (1 | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_threeway 9 3671.7 3702.7 -1826.9 3653.7
## no_random_slopes 10 3657.1 3691.6 -1818.5 3637.1 16.628 1 4.548e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
No.
Should we keep t * M?
no_TM = lmer(bioarea_per_volume ~
day +
patch_size_volume +
disturbance +
day : disturbance +
patch_size_volume : disturbance +
(1 | system_nr),
data = ds_biomass_abund %>%
filter(time_point >= 2) %>%
filter(metaecosystem == "no"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
anova(no_threeway,no_TM)
## Data: ds_biomass_abund %>% filter(time_point >= 2) %>% filter(metaecosystem == ...
## Models:
## no_TM: bioarea_per_volume ~ day + patch_size_volume + disturbance + day:disturbance + patch_size_volume:disturbance + (1 | system_nr)
## no_threeway: bioarea_per_volume ~ day + patch_size_volume + disturbance + day:patch_size_volume + day:disturbance + patch_size_volume:disturbance + (1 | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_TM 8 3670.0 3697.6 -1827.0 3654.0
## no_threeway 9 3671.7 3702.7 -1826.9 3653.7 0.2687 1 0.6042
Yes.
Should we keep t * D?
no_TD = lmer(bioarea_per_volume ~
day +
patch_size_volume +
disturbance +
day : patch_size_volume +
patch_size_volume : disturbance +
(1 | system_nr),
data = ds_biomass_abund %>%
filter(time_point >= 2) %>%
filter(metaecosystem == "no"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
anova(no_threeway, no_TD)
## Data: ds_biomass_abund %>% filter(time_point >= 2) %>% filter(metaecosystem == ...
## Models:
## no_TD: bioarea_per_volume ~ day + patch_size_volume + disturbance + day:patch_size_volume + patch_size_volume:disturbance + (1 | system_nr)
## no_threeway: bioarea_per_volume ~ day + patch_size_volume + disturbance + day:patch_size_volume + day:disturbance + patch_size_volume:disturbance + (1 | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_TD 8 3670.1 3697.7 -1827.0 3654.1
## no_threeway 9 3671.7 3702.7 -1826.9 3653.7 0.3821 1 0.5365
No.
Should we keep M * D?
no_MD = lmer(bioarea_per_volume ~
day +
patch_size_volume +
disturbance +
day : patch_size_volume +
(1 | system_nr),
data = ds_biomass_abund %>%
filter(time_point >= 2) %>%
filter(metaecosystem == "no"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
anova(no_TD, no_MD)
## Data: ds_biomass_abund %>% filter(time_point >= 2) %>% filter(metaecosystem == ...
## Models:
## no_MD: bioarea_per_volume ~ day + patch_size_volume + disturbance + day:patch_size_volume + (1 | system_nr)
## no_TD: bioarea_per_volume ~ day + patch_size_volume + disturbance + day:patch_size_volume + patch_size_volume:disturbance + (1 | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_MD 7 3668.4 3692.5 -1827.2 3654.4
## no_TD 8 3670.1 3697.7 -1827.0 3654.1 0.2905 1 0.5899
No.
Best model
Therefore, our best model is:
\[ Regional \: bioarea = t + M + D + tM + (1 | system \: nr) \]
best_model = no_MD
Let’s do some model diagnostics:
plot(best_model)
qqnorm(resid(best_model))
The R squared of this model for t2-t7 are:
R2_marginal = r.squaredGLMM(best_model)[1]
R2_marginal = round(R2_marginal, digits = 2)
R2_conditional = r.squaredGLMM(best_model)[2]
R2_conditional = round(R2_conditional, digits = 2)
Marginal R2 = 0.71
Conditional R2 = 0.73
The inclusive R squares of this model are:
R2_isolated_patches = partR2(best_model,
partvars = c("day",
"patch_size_volume",
"disturbance"),
R2_type = "conditional",
nboot = 1000,
CI = 0.95)
saveRDS(R2_isolated_patches, file = here("results", "biomass", "R2_isolated_patches.RData"))
R2_regional = readRDS(here("results", "biomass", "R2_isolated_patches.RData"))
R2_regional$IR2
## # A tibble: 4 × 4
## term estimate CI_lower CI_upper
## <chr> <dbl> <dbl> <dbl>
## 1 day 0.00643 0.0000254 0.0533
## 2 patch_size_volume 0.00510 0.0000175 0.0525
## 3 disturbancelow 0.0112 0.0000276 0.0597
## 4 day:patch_size_volume 0.000806 0.00000631 0.0376
Let’s just assume that this model holds also for t2-t5. Then, let’s recalculate the R squared.
t2_t5 = lmer(bioarea_per_volume ~
day +
patch_size_volume +
disturbance +
day : patch_size_volume +
(1 | system_nr),
data = ds_biomass_abund %>%
filter(time_point >= 2) %>%
filter(time_point <= 5) %>%
filter(metaecosystem == "no"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.0052193 (tol = 0.002, component 1)
plot(t2_t5)
qqnorm(resid(t2_t5))
R2_marginal = r.squaredGLMM(t2_t5)[1]
R2_marginal = round(R2_marginal, digits = 2)
R2_conditional = r.squaredGLMM(t2_t5)[2]
R2_conditional = round(R2_conditional, digits = 2)
The R squared of this model for t2-t5 are:
Marginal R2 = 0.67
Conditional R2 = 0.68
The inclusive R squares of this model are:
R2_isolated_patches = partR2(t2_t5,
partvars = c("day",
"patch_size_volume",
"disturbance"),
R2_type = "conditional",
nboot = 1000,
CI = 0.95)
saveRDS(R2_isolated_patches, file = here("results", "biomass", "R2_isolated_patches_t2t5.RData"))
R2_regional = readRDS(here("results", "R2_isolated_patches_t2t5.RData"))
R2_regional$IR2
## # A tibble: 4 × 4
## term estimate CI_lower CI_upper
## <chr> <dbl> <dbl> <dbl>
## 1 day 0.356 0.263 0.460
## 2 patch_size_volume 0.313 0.212 0.416
## 3 disturbancelow 0.000809 0.00000421 0.0242
## 4 day:patch_size_volume 0.0399 0.00730 0.0956
I could compute the single time points but I don’t see the reason why I should do that.
for (disturbance_input in c("low", "high")) {
print(ds_biomass_abund %>%
group_by(culture_ID, disturbance, day, eco_metaeco_type) %>%
summarise(indiv_per_volume = mean(indiv_per_volume)) %>% #Average across videos
filter(disturbance == disturbance_input) %>%
ggplot(aes(x = day,
y = indiv_per_volume,
group = interaction(day, eco_metaeco_type),
fill = eco_metaeco_type)) +
geom_boxplot() +
labs(title = paste("Disturbance =", disturbance_input),
x = "Day",
y = "Community density (individuals/µl)",
fill = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
# legend.position = c(.99, .999),
# legend.justification = c("right", "top"),
# legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_fill_discrete(labels = c("large isolated",
"large connected to large",
"large connected to small",
"medium isolated",
"medium connected to medium",
"small isolated",
"small connected to large",
"small connected to small")) +
geom_vline(xintercept = first_perturbation_day + 0.6,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation"))
}
ds_abundance_total = ds_biomass_abund %>%
filter(!culture_ID %in% ecosystems_to_take_off) %>%
group_by(culture_ID,
system_nr,
disturbance,
time_point,
day,
patch_size,
patch_size_volume,
eco_metaeco_type) %>%
summarise(indiv_per_volume_video_averaged = mean(indiv_per_volume)) %>%
mutate(total_patch_bioarea = indiv_per_volume_video_averaged * patch_size_volume)
for (disturbance_input in c("low", "high")) {
print(ds_abundance_total %>%
filter(disturbance == disturbance_input) %>%
ggplot(aes(x = day,
y = total_patch_bioarea,
group = interaction(day, eco_metaeco_type),
fill = eco_metaeco_type)) +
geom_boxplot() +
labs(title = paste("Disturbance =", disturbance_input),
x = "Day",
y = "Community abundance (individuals)",
fill = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
# legend.position = c(.99, .999),
# legend.justification = c("right", "top"),
# legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_fill_discrete(labels = c("large isolated",
"large connected to large",
"large connected to small",
"medium isolated",
"medium connected to medium",
"small isolated",
"small connected to large",
"small connected to small")) +
geom_vline(xintercept = first_perturbation_day + 0.6,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation"))
}
Research question: how does community abundance change according to the size to which the patch is connected? (i.e., does a small patch connected to a small patch have the same community abundance than a small patch connected to a large patch?)
for (disturbance_input in c("low", "high")) {
print(ds_biomass_abund %>%
filter(disturbance == disturbance_input) %>%
filter(patch_size == "S") %>%
ggplot(aes(x = day,
y = bioarea_per_volume,
group = culture_ID,
fill = culture_ID,
color = culture_ID,
linetype = eco_metaeco_type)) +
geom_line(stat = "summary", fun = "mean") +
labs(title = paste("Disturbance =", disturbance_input),
x = "Day",
y = "Community density (individuals/μl)",
linetype = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_linetype_discrete(labels = c("small isolated",
"small connected to large",
"small connected to small")) +
geom_vline(xintercept = first_perturbation_day,
linetype = "dotdash",
color = "grey",
size = 0.7) +
labs(caption = "Vertical grey line: first perturbation"))}
for (disturbance_input in c("low", "high")) {
print(ds_biomass_abund %>%
filter(disturbance == disturbance_input) %>%
filter(patch_size == "S") %>%
ggplot(aes(x = day,
y = bioarea_per_volume,
group = interaction(day,eco_metaeco_type),
fill = eco_metaeco_type)) +
geom_boxplot() +
labs(title = paste("Disturbance =", disturbance_input),
x = "Day",
y = "Community density (individuals/μl)",
fill = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_fill_discrete(labels = c("small isolated",
"small connected to large",
"small connected to small")) +
geom_vline(xintercept = first_perturbation_day + 0.7,
linetype = "dotdash",
color = "grey",
size = 0.7) +
labs(caption = "Vertical grey line: first perturbation"))}
for (disturbance_input in c("low", "high")) {
print(ds_effect_size_community_density %>%
filter(!time_point == 0) %>% #At time point 0 all cultures were the same
filter(disturbance == disturbance_input) %>%
filter(eco_metaeco_type == "S (S_S)" | eco_metaeco_type == "S (S_L)") %>%
ggplot(aes(x = day,
y = community_density_hedges_d,
color = eco_metaeco_type)) +
geom_point(position = position_dodge(0.5)) +
geom_line(position = position_dodge(0.5)) +
labs(title = paste("Disturbance =", disturbance_input),
x = "Day",
y = "Community density effect size",
color = "") +
#geom_errorbar(aes(ymin = lnRR_lower,
# ymax = lnRR_upper),
# width = .2,
# position = position_dodge(0.5)) +
scale_color_discrete(labels = c("small connected to large",
"small connnected to small")) +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.40, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
#geom_vline(xintercept = first_perturbation_day + 0.7,
# linetype="dotdash",
# color = "grey",
# size=0.7) +
geom_hline(yintercept = 0,
linetype = "dotted",
color = "black",
size = 0.7))}
Let’s start from the full model (no mixed effect: meta-ecosystems have been pulled to create the lnRR):
\[ ln \: RR (community \: density) = t + P + D + t*P + t*D + P*D \]
lnRR(bioarea density) = lnRR of the bioarea density (base level is calculated at each disturbance level and time point as the mean bioarea of the small isolated patches)
t = time
P = patch type
D = disturbance
We will exclude in the model the time point 0 and 1. At time point 0 all cultures were the same because that’s how we made them. At time point 1 no disturbance event had already taken place.
first_time_point = 2
last_time_point = 7
full_model = lm(community_density_hedges_d ~
day +
eco_metaeco_type +
disturbance +
day * eco_metaeco_type +
day * disturbance +
eco_metaeco_type * disturbance,
data = ds_effect_size_community_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "S (S_S)" | eco_metaeco_type == "S (S_L)"))
Should we keep t * P
(day * eco_metaeco_type)?
no_TP = lm(community_density_hedges_d ~
day +
eco_metaeco_type +
disturbance +
day * disturbance +
eco_metaeco_type * disturbance,
data = ds_effect_size_community_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "S (S_S)" | eco_metaeco_type == "S (S_L)"))
AIC(full_model, no_TP)
## df AIC
## full_model 8 55.62436
## no_TP 7 57.46021
No.
Should we keep t * D (day * disturbance)?
no_TD = lm(community_density_hedges_d ~
day +
eco_metaeco_type +
disturbance +
eco_metaeco_type * disturbance,
data = ds_effect_size_community_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "S (S_S)" | eco_metaeco_type == "S (S_L)"))
AIC(no_TP, no_TD)
## df AIC
## no_TP 7 57.46021
## no_TD 6 60.38239
No.
Should we keep P * D
(eco_metaeco_type * disturbance)?
no_PD = lm(community_density_hedges_d ~
day +
eco_metaeco_type +
disturbance,
data = ds_effect_size_community_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "S (S_S)" | eco_metaeco_type == "S (S_L)"))
AIC(no_TD, no_PD)
## df AIC
## no_TD 6 60.38239
## no_PD 5 62.82164
Yes.
Then our best model is:
\[ lnRR (community \: density) = t + P + D + P*D \]
Let’s then do some model diagnostics.
best_model = no_TD
par(mfrow = c(2,3))
plot(best_model, which = 1:5)
R2_full = glance(best_model)$r.squared
no_patch_type = lm(community_density_hedges_d ~
day +
disturbance,
data = ds_effect_size_community_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "S (S_S)" |
eco_metaeco_type == "S (S_L)"))
R2_no_P = glance(no_patch_type)$r.squared
R2_P = R2_full - R2_no_P
R2_full = round(R2_full, digits = 2)
R2_P = round(R2_P, digits = 2)
The adjusted R squared of the model is 0.52 and the adjusted R squared
of patch type is 0.4 (which includes also its interaction with
disturbance).
Single time points: see modelling choices.
Research question: how does community abundance change according to the size to which the patch is connected? (i.e., does a large patch connected to a large patch have the same community abundance than a large patch connected to a small patch?)
for (disturbance_input in c("low", "high")) {
print(ds_biomass_abund %>%
filter(disturbance == disturbance_input) %>%
filter(patch_size == "L") %>%
ggplot(aes(x = day,
y = indiv_per_volume,
group = system_nr,
fill = system_nr,
color = system_nr,
linetype = eco_metaeco_type)) +
geom_line(stat = "summary", fun = "mean") +
labs(x = "Day",
y = "Community density (individuals/μl)",
title = paste("Disturbance =", disturbance_input),
linetype = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_linetype_discrete(labels = c("large isolated",
"large connected to large",
"large connected to small")) +
geom_vline(xintercept = first_perturbation_day,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation"))}
for (disturbance_input in c("low", "high")){
print(ds_biomass_abund %>%
filter(disturbance == disturbance_input) %>%
filter(patch_size == "L") %>%
ggplot(aes(x = day,
y = indiv_per_volume,
group = interaction(day,eco_metaeco_type),
fill = eco_metaeco_type)) +
geom_boxplot() +
labs(title = paste("Disturbance =", disturbance_input),
x = "Day",
y = "Community density (individuals/μl)",
fill = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_fill_discrete(labels = c("large isolated",
"large connected to large",
"large connected to small")) +
geom_vline(xintercept = first_perturbation_day + 0.7,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation"))
}
for (disturbance_input in c("low", "high")){
print(ds_effect_size_community_density %>%
filter(!time_point == 0) %>% #At time point 0 all cultures were the same
filter(disturbance == disturbance_input) %>%
filter(eco_metaeco_type == "L (L_L)" | eco_metaeco_type == "L (S_L)") %>%
ggplot(aes(x = day,
y = community_density_hedges_d,
color = eco_metaeco_type)) +
geom_point(position = position_dodge(0.5)) +
geom_line(position = position_dodge(0.5)) +
labs(title = paste("Disturbance =", disturbance_input),
x = "Day",
y = "Community density effect size",
color = "") +
scale_color_discrete(labels = c("large connected to large",
"large connnected to small")) +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.90, 1.01),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
# geom_vline(xintercept = first_perturbation_day + 0.7,
# linetype="dotdash",
# color = "grey",
# size=0.7) +
geom_hline(yintercept = 0,
linetype = "dotted",
color = "black",
size = 0.7))}
We will exclude in the model the time point 0 and 1. At time point 0 all cultures were the same because that’s how we made them. At time point 1 no disturbance event had already taken place.
first_time_point = 2
last_time_point = 7
Let’s start from the full model (no mixed effect: meta-ecosystems have been pulled to create the lnRR):
\[ ln \: RR (community \: density) = t + P + D + t*P + t*D + P*D \]
lnRR(bioarea density) = lnRR of the bioarea density (base level is calculated at each disturbance level and time point as the mean bioarea of the small isolated patches)
t = time
P = patch type
D = disturbance
full_model = lm(community_density_hedges_d ~
day +
eco_metaeco_type +
disturbance +
day * eco_metaeco_type +
day * disturbance +
eco_metaeco_type * disturbance,
data = ds_effect_size_community_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "L (L_L)" |
eco_metaeco_type == "L (S_L)"))
Should we keep P?
I need to figure this out.
Should we keep D?
I need to figure this out.
Should we keep t * P?
no_TP = lm(community_density_hedges_d ~
day +
eco_metaeco_type +
disturbance +
day * disturbance +
eco_metaeco_type * disturbance,
data = ds_effect_size_community_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "L (L_L)" |
eco_metaeco_type == "L (S_L)"))
AIC(full_model, no_TP)
## df AIC
## full_model 8 43.60508
## no_TP 7 44.52357
No.
Should we keep t * D?
no_TD = lm(community_density_hedges_d ~
day +
eco_metaeco_type +
disturbance +
eco_metaeco_type * disturbance,
data = ds_effect_size_community_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "L (L_L)" |
eco_metaeco_type == "L (S_L)"))
AIC(no_TP, no_TD)
## df AIC
## no_TP 7 44.52357
## no_TD 6 56.97936
Yes.
Should we keep P * D?
no_PD = lm(community_density_hedges_d ~
day +
eco_metaeco_type +
disturbance +
day * disturbance,
data = ds_effect_size_community_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "L (L_L)" |
eco_metaeco_type == "L (S_L)"))
AIC(no_TP, no_PD)
## df AIC
## no_TP 7 44.52357
## no_PD 6 42.55163
No.
Then our best model is:
\[ lnRR (community \: density) = t + P + D + tP \]
Let’s then do some model diagnostics.
best_model = no_PD
par(mfrow = c(2,3))
plot(best_model, which = 1:5)
R2_full = glance(best_model)$r.squared
no_patch_type = lm(community_density_hedges_d ~
day +
disturbance +
day * disturbance,
data = ds_effect_size_community_density %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "L (L_L)" |
eco_metaeco_type == "L (S_L)"))
R2_no_P = glance(no_patch_type)$r.squared
R2_P = R2_full - R2_no_P
R2_full = round(R2_full, digits = 2)
R2_P = round(R2_P, digits = 2)
The adjusted R squared of the model is 0.65 and the adjusted R squared
of patch type is 0.01 (which includes also its interaction with
disturbance).
Single time points: see modelling choices.
Research question: how does community abundance change according to the size of isolated patches? (i.e., how does the biomass of small, medium, and large patches change?)
for (disturbance_input in c("low", "high")){
print(ds_biomass_abund %>%
filter ( disturbance == disturbance_input) %>%
filter(metaecosystem == "no") %>%
group_by (system_nr, day, patch_size) %>%
summarise(mean_indiv_per_volume_across_videos = mean(indiv_per_volume)) %>%
ggplot (aes(x = day,
y = mean_indiv_per_volume_across_videos,
group = system_nr,
fill = system_nr,
color = system_nr,
linetype = patch_size)) +
geom_line () +
labs(title = paste("Disturbance =", disturbance_input),
x = "Day",
y = "Community density (individuals/µl)",
fill = "System nr",
linetype = "") +
scale_colour_continuous(guide = "none") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_linetype_discrete(labels = c("large isolated",
"medium isolated",
"small isolated")))}
for (disturbance_input in c("low", "high")){
print(ds_biomass_abund %>%
filter(disturbance == disturbance_input) %>%
filter(metaecosystem == "no") %>%
ggplot(aes(x = day,
y = indiv_per_volume,
group = interaction(day, patch_size),
fill = patch_size)) +
geom_boxplot() +
labs(title = paste("Disturbance =", disturbance_input),
x = "Day",
y = "Community density (individuals/μl)",
fill = "") +
scale_fill_discrete(labels = c("isolated large",
"isolated medium",
"isolated small")) +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)))}
We will exclude in the model the time point 0 and 1. At time point 0 all cultures were the same because that’s how we made them. At time point 1 no disturbance event had already taken place.
Let’s see how linear is the time trend of bioarea and if we can make it more linear with a log10 transformation. We are lucky that during the modelling process we need to drop the first two time points, which would have made the biomass trend not linear.
Linearity of regional bioarea ~ time
ds_biomass_abund %>%
filter(time_point >= 2) %>%
ggplot(aes(x = day,
y = indiv_per_volume,
group = day)) +
geom_boxplot() +
labs(x = "Day",
y = "Regional bioarea (µm²)")
linear_model = lm(indiv_per_volume ~
day,
data = ds_biomass_abund %>%
filter(time_point >= 2) %>%
filter(metaecosystem == "no"))
par(mfrow=c(2,3))
plot(linear_model, which = 1:5)
Model selection
Let’ start from the full model.
\[ Local \: Bioarea \: Density = t + P + D + tP + tD + PD + tDP + (t | system \: nr) \]
full = lmer(indiv_per_volume ~
day * patch_size_volume * disturbance +
(day | system_nr),
data = ds_biomass_abund %>%
filter(time_point >= 2) %>%
filter(metaecosystem == "no"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
Should we keep P (patch_size_volume)?
Need to figure this out.
Should we keep D (disturbance)?
Need to figure this out.
Should we keep the correlation in
(day | system_nr)?
no_correlation = lmer(indiv_per_volume ~
day * patch_size_volume * disturbance +
(day || system_nr),
data = ds_biomass_abund %>%
filter(time_point >= 2) %>%
filter(metaecosystem == "no"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
anova(full, no_correlation)
## Data: ds_biomass_abund %>% filter(time_point >= 2) %>% filter(metaecosystem == ...
## Models:
## no_correlation: indiv_per_volume ~ day * patch_size_volume * disturbance + ((1 | system_nr) + (0 + day | system_nr))
## full: indiv_per_volume ~ day * patch_size_volume * disturbance + (day | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_correlation 11 -85.077 -47.163 53.538 -107.08
## full 12 -84.375 -43.014 54.188 -108.38 1.2986 1 0.2545
No.
Should we keep the random effect of system nr on the time slopes
(day | system_nr)?
no_random_slopes = lmer(indiv_per_volume ~
day * patch_size_volume * disturbance +
(1 | system_nr),
data = ds_biomass_abund %>%
filter(time_point >= 2) %>%
filter(metaecosystem == "no"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
anova(no_correlation, no_random_slopes)
## Data: ds_biomass_abund %>% filter(time_point >= 2) %>% filter(metaecosystem == ...
## Models:
## no_random_slopes: indiv_per_volume ~ day * patch_size_volume * disturbance + (1 | system_nr)
## no_correlation: indiv_per_volume ~ day * patch_size_volume * disturbance + ((1 | system_nr) + (0 + day | system_nr))
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_random_slopes 10 -87.077 -52.609 53.538 -107.08
## no_correlation 11 -85.077 -47.163 53.538 -107.08 0 1 1
No.
Should we keep t * M * D?
no_threeway = lmer(indiv_per_volume ~
day +
patch_size_volume +
disturbance +
day : patch_size_volume +
day : disturbance +
patch_size_volume : disturbance +
(1 | system_nr),
data = ds_biomass_abund %>%
filter(time_point >= 2) %>%
filter(metaecosystem == "no"),
REML = FALSE,
control = lmerControl(optimizer = 'optimx',
optCtrl = list(method = 'L-BFGS-B')))
anova(no_random_slopes, no_threeway)
## Data: ds_biomass_abund %>% filter(time_point >= 2) %>% filter(metaecosystem == ...
## Models:
## no_threeway: indiv_per_volume ~ day + patch_size_volume + disturbance + day:patch_size_volume + day:disturbance + patch_size_volume:disturbance + (1 | system_nr)
## no_random_slopes: indiv_per_volume ~ day * patch_size_volume * disturbance + (1 | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_threeway 9 -65.532 -34.512 41.766 -83.532
## no_random_slopes 10 -87.077 -52.609 53.538 -107.077 23.544 1 1.221e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
No.
Should we keep t * M?
no_TM = lmer(indiv_per_volume ~
day +
patch_size_volume +
disturbance +
day : disturbance +
patch_size_volume : disturbance +
(1 | system_nr),
data = ds_biomass_abund %>%
filter(time_point >= 2) %>%
filter(metaecosystem == "no"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
anova(no_threeway,no_TM)
## Data: ds_biomass_abund %>% filter(time_point >= 2) %>% filter(metaecosystem == ...
## Models:
## no_TM: indiv_per_volume ~ day + patch_size_volume + disturbance + day:disturbance + patch_size_volume:disturbance + (1 | system_nr)
## no_threeway: indiv_per_volume ~ day + patch_size_volume + disturbance + day:patch_size_volume + day:disturbance + patch_size_volume:disturbance + (1 | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_TM 8 -66.433 -38.859 41.217 -82.433
## no_threeway 9 -65.532 -34.512 41.766 -83.532 1.0989 1 0.2945
Yes.
Should we keep t * D?
no_TD = lmer(indiv_per_volume ~
day +
patch_size_volume +
disturbance +
day : patch_size_volume +
patch_size_volume : disturbance +
(1 | system_nr),
data = ds_biomass_abund %>%
filter(time_point >= 2) %>%
filter(metaecosystem == "no"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
anova(no_threeway, no_TD)
## Data: ds_biomass_abund %>% filter(time_point >= 2) %>% filter(metaecosystem == ...
## Models:
## no_TD: indiv_per_volume ~ day + patch_size_volume + disturbance + day:patch_size_volume + patch_size_volume:disturbance + (1 | system_nr)
## no_threeway: indiv_per_volume ~ day + patch_size_volume + disturbance + day:patch_size_volume + day:disturbance + patch_size_volume:disturbance + (1 | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_TD 8 -67.299 -39.725 41.649 -83.299
## no_threeway 9 -65.532 -34.512 41.766 -83.532 0.2335 1 0.6289
No.
Should we keep M * D?
no_MD = lmer(indiv_per_volume ~
day +
patch_size_volume +
disturbance +
day : patch_size_volume +
(1 | system_nr),
data = ds_biomass_abund %>%
filter(time_point >= 2) %>%
filter(metaecosystem == "no"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
anova(no_TD, no_MD)
## Data: ds_biomass_abund %>% filter(time_point >= 2) %>% filter(metaecosystem == ...
## Models:
## no_MD: indiv_per_volume ~ day + patch_size_volume + disturbance + day:patch_size_volume + (1 | system_nr)
## no_TD: indiv_per_volume ~ day + patch_size_volume + disturbance + day:patch_size_volume + patch_size_volume:disturbance + (1 | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_MD 7 -65.562 -41.435 39.781 -79.562
## no_TD 8 -67.299 -39.725 41.649 -83.299 3.7363 1 0.05324 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
No.
Best model
Therefore, our best model is:
\[ Regional \: bioarea = t + M + D + tM + (1 | system \: nr) \]
best_model = no_MD
Let’s do some model diagnostics:
plot(best_model)
qqnorm(resid(best_model))
The R squared of this model for t2-t7 are:
R2_marginal = r.squaredGLMM(best_model)[1]
R2_marginal = round(R2_marginal, digits = 2)
R2_conditional = r.squaredGLMM(best_model)[2]
R2_conditional = round(R2_conditional, digits = 2)
Marginal R2 = 0.69
Conditional R2 = 0.7
The inclusive R squares of this model are:
R2_isolated_patches = partR2(best_model,
partvars = c("day",
"patch_size_volume",
"disturbance"),
R2_type = "conditional",
nboot = 1000,
CI = 0.95)
saveRDS(R2_isolated_patches, file = here("results", "biomass", "R2_isolated_patches.RData"))
R2_regional = readRDS(here("results", "biomass", "R2_isolated_patches.RData"))
R2_regional$IR2
## # A tibble: 4 × 4
## term estimate CI_lower CI_upper
## <chr> <dbl> <dbl> <dbl>
## 1 day 0.00643 0.0000254 0.0533
## 2 patch_size_volume 0.00510 0.0000175 0.0525
## 3 disturbancelow 0.0112 0.0000276 0.0597
## 4 day:patch_size_volume 0.000806 0.00000631 0.0376
Let’s just assume that this model holds also for t2-t5. Then, let’s recalculate the R squared.
t2_t5 = lmer(indiv_per_volume ~
day +
patch_size_volume +
disturbance +
day : patch_size_volume +
(1 | system_nr),
data = ds_biomass_abund %>%
filter(time_point >= 2) %>%
filter(time_point <= 5) %>%
filter(metaecosystem == "no"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
plot(t2_t5)
qqnorm(resid(t2_t5))
R2_marginal = r.squaredGLMM(t2_t5)[1]
R2_marginal = round(R2_marginal, digits = 2)
R2_conditional = r.squaredGLMM(t2_t5)[2]
R2_conditional = round(R2_conditional, digits = 2)
The R squared of this model for t2-t5 are:
Marginal R2 = 0.65
Conditional R2 = 0.69
The inclusive R squares of this model are:
R2_isolated_patches = partR2(t2_t5,
partvars = c("day",
"patch_size_volume",
"disturbance"),
R2_type = "conditional",
nboot = 1000,
CI = 0.95)
saveRDS(R2_isolated_patches, file = here("results", "R2_isolated_patches_t2t5.RData"))
R2_regional = readRDS(here("results", "R2_isolated_patches_t2t5.RData"))
R2_regional$IR2
## # A tibble: 4 × 4
## term estimate CI_lower CI_upper
## <chr> <dbl> <dbl> <dbl>
## 1 day 0.356 0.263 0.460
## 2 patch_size_volume 0.313 0.212 0.416
## 3 disturbancelow 0.000809 0.00000421 0.0242
## 4 day:patch_size_volume 0.0399 0.00730 0.0956
I could compute the single time points but I don’t see the reason why I should do that.
include_graphics(here("gifs", "transition_day_L_low.gif"))